grep doesn't output until EOF if piped through cat




Given this minimal example


( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; )



it outputs LINE 1 and then, after one second, outputs LINE 2, as expected.


LINE 1


LINE 2



If we pipe this to grep LINE


( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; ) | grep LINE



the behavior is the same as in the previous case, as expected.



If, alternatively, we pipe this to cat


( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; ) | cat



the behavior is again the same, as expected.



However, if we pipe to grep LINE, and then to cat,


( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; ) | grep LINE | cat



there is no output until one second has passed, and then both lines appear at once, which I did not expect.



Why is this happening, and how can I make the last version behave the same way as the first three commands?





cat concatenates files. What are you trying to do by piping into cat?
– Douglas Held
Sep 5 '18 at 20:09







@DouglasHeld When called without arguments, cat simply reads stdin and outputs into stdout. Of course, I came up with this question with a lot of complex stuff in place of echo and cat, but these turned out to be irrelevant, since the problem shows up with much simpler examples.
– lisyarus
Sep 5 '18 at 21:11







@DouglasHeld: Piping to cat is often useful to force stdout to not be a terminal. For instance, this is an easy way to get many commands to not use colorized output.
– wchargin
Sep 7 '18 at 5:01





I swear this is a duplicate of another question on Stack Overflow!
– iBug
Sep 7 '18 at 6:46






@wchargin thank you very much, you have taught me something about POSIX that I never knew.
– Douglas Held
Oct 12 '18 at 22:01




3 Answers



When grep's standard output is not a terminal, grep (at least GNU grep) fully buffers its output instead of writing it line by line, which is what causes the behaviour you're seeing. You can disable this either with GNU grep's --line-buffered option:


( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; ) | grep --line-buffered LINE | cat



or the stdbuf utility:


( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; ) | stdbuf -oL grep LINE | cat



The question "Turn off buffering in pipe" has more on this topic.
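One way to check that either fix really changes when the lines arrive is to timestamp them on the receiving end of the pipe. The sketch below is only an illustration: `stamp` and `produce` are hypothetical helper names, `date +%s` is assumed to be available (it is on GNU and BSD systems but is not required by POSIX), and `--line-buffered` is assumed to be supported by the installed grep.

```shell
#!/bin/sh
# Hypothetical helper: prefix each line read from stdin with the number of
# whole seconds elapsed since the helper started reading.
stamp() {
    start=$(date +%s)    # assumption: date supports %s (GNU and BSD do)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%ss %s\n' "$((now - start))" "$line"
    done
}

# Hypothetical helper: the producer from the question.
produce() {
    echo "LINE 1"
    sleep 1
    echo "LINE 2"
}

# Default buffering: both lines usually reach stamp together, about a second in.
produce | grep LINE | stamp

# GNU grep's line buffering: LINE 1 should arrive about a second before LINE 2.
produce | grep --line-buffered LINE | stamp
```

The timestamps have one-second resolution, so they show only whether the two lines arrived together or apart, not the exact delay.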



Simplified explanation



This is not peculiar to one program: like many utilities, grep switches its standard output between line buffered and fully buffered. In the former case, the C library buffers output data in memory until either the buffer is full or a linefeed character is added to it (or the program ends cleanly), whereupon it calls write() to actually write the buffer contents. In the latter case, only the in-memory buffer becoming full (or the program ending cleanly) triggers the write().





More detailed explanation



This is the well-known, but slightly wrong, explanation. In fact, in the GNU C library and the BSD C library, standard output is not line buffered but smart buffered. Standard output is also flushed when a read of standard input exhausts the in-memory buffer of pre-read input, so that the C library has to call read() to fetch more, and that read starts at the beginning of a new line. (One reason for this is to prevent deadlock when another program connects itself to both ends of a filter and expects to operate line by line, alternating between writing to the filter and reading from it, like "coprocesses" in GNU awk for example.)





C library influence



grep and the other utilities do this — or, more strictly, the C libraries that they use do this, because this is a defined feature of programming in the C language — based upon what they detect their standard output to be. If (and only if) it is not an interactive device, they choose full buffering, otherwise they choose smart buffering. A pipe is considered to be not an interactive device, because the definition of being an interactive device, at least in the world of Unix and Linux, is essentially the isatty() call returning true for the relevant file descriptor.
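The same terminal check is available from the shell as `test -t`, which makes it easy to see what a program in a given position of a pipeline would detect. This is a minimal sketch: `stdout_kind` is a hypothetical helper name, and it reports on standard error so that the message stays visible even when standard output is redirected.

```shell
#!/bin/sh
# Hypothetical helper: report whether the caller's standard output is a terminal.
# `[ -t 1 ]` succeeds when file descriptor 1 refers to a terminal, which is the
# shell-level counterpart of isatty(1).
stdout_kind() {
    if [ -t 1 ]; then
        echo "terminal" >&2
    else
        echo "not a terminal" >&2
    fi
}

stdout_kind          # typed at an interactive prompt, stdout is the terminal
stdout_kind | cat    # stdout is a pipe, so this reports "not a terminal"
```

In the question's last pipeline, grep sits where the second call does: its standard output is a pipe to cat, so the C library picks full buffering.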





Workarounds to disable full buffering



Some utilities, like grep, have idiosyncratic options such as --line-buffered that change this decision (and, as you can see, that option is mis-named). But only a vanishingly small fraction of the filter programs that one could use actually have such an option.





More generally, there are two kinds of tool for this. Some dig into the specific internals of the C library and change its decision making; these have security problems if the program to be altered is set-UID, are specific to particular C libraries, and indeed only work for programs written in or layered on top of the C language. Others, such as ptybandage, do not change the internals of the program but simply interpose a pseudo-terminal as standard output, so that the decision comes out as "interactive".
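For a filter that has no buffering option of its own, a small wrapper can apply the C-library approach when it is available. The sketch below assumes that `stdbuf` from GNU coreutils (used in the other answer) is a tool of the first kind; `linebuf` is a hypothetical function name, and the fallback simply runs the command unchanged on systems without `stdbuf`.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with line-buffered standard output when
# stdbuf is installed, and run it unchanged otherwise.
linebuf() {
    if command -v stdbuf >/dev/null 2>&1; then
        stdbuf -oL "$@"
    else
        "$@"
    fi
}

# The pipeline from the question, with grep wrapped.
( echo "LINE 1" ; sleep 1 ; echo "LINE 2" ; ) | linebuf grep LINE | cat
```

The caveats above still apply: this has no effect on programs that do not use the C library's stdio buffering, or that set their own buffering mode explicitly.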










Underrated answer. Thanks for the info!
– Délisson Junio
Sep 5 '18 at 17:37





If the phrase "line buffered" is a misnomer, then it's not really the fault of grep, but of the underlying library calls, setbuf/setvbuf. I don't know of a reliable online reference for the C standard, but e.g. the Linux and FreeBSD man pages along with the POSIX description of setvbuf call it "line buffered". Even the symbolic constant for it is _IOLBF.
– ilkkachu
Sep 5 '18 at 21:19







Well now you've learned better. This buffering strategy is described in the GNU C library doco, albeit briefly. Laurent Bercot is more forthright on the matter. I have mentioned it too.
– JdeBP
Sep 6 '18 at 0:35





@ilkkachu The C standard does indeed use "line buffered". Per 7.21.3 Files, paragraph 3: "When a stream is unbuffered, ... When a stream is fully buffered, ... When a stream is line buffered, characters are intended to be transmitted to or from the host environment as a block when a new-line character is encountered. ..." In fact, the C Standard uses the exact phrase "line buffered" five times. So it's not a misnomer.
– Andrew Henle
Sep 6 '18 at 14:41






Furthermore, the approach described here as "smart buffering", as I understand it, seems to be just what the C standard describes as "line buffering". Specifically, in addition to flushing the buffer at newlines, "When a stream is line buffered, characters are intended to be transmitted to or from the host environment as a block when [...] input is requested on an unbuffered stream, or when input is requested on a line buffered stream that requires the transmission of characters from the host environment." So this is not a GNU or BSD quirk, but rather what the language calls for.
– John Bollinger
Sep 6 '18 at 22:44




Use


grep --line-buffered



to make grep not buffer more than one line at a time.


