sed - Addressing using two strings




I am picking up sed and am having trouble understanding how line addressing works when a pattern is used to specify the address.



I have a sample text file named emp.lst with the following contents:




2233|a.k. shukla |g.m. |sales |12/12/52|6000
9876|jai sharma |director |production|12/03/50|7000
5678|sumit chakrobarty|d.g.m. |marketing |19/04/43|6000
2365|barun sengupta |director |personnel |11/05/47|7800
5423|n.k. gupta |chairman |admin |30/08/56|5400
1006|chanchal singhvi |director |sales |03/09/38|6700
6213|karuna ganguly |g.m. |accounts |05/06/62|6300
1265|s.n. dasgupta |manager |sales |12/09/63|5600
4290|jayant Choudhury |executive|production|07/09/50|6000
2476|anil aggarwal |manager |sales |01/05/59|5000
6521|lalit chowdury |director |marketing |26/09/45|8200
3212|shyam saksena |d.g.m. |accounts |12/12/55|6000
3564|sudhir Agarwal |executive|personnel |06/07/47|7500
2345|j.b. saxena |g.m. |marketing |12/03/45|8000
0110|v.k. agrawal |g.m. |marketing |31/12/40|9000



As I understand it, a line address can be specified either as line number(s) or as a pattern to match, given as text or a regular expression.



I understand how sed -n '1p' emp.lst and sed -n '1,2p' emp.lst print line 1 and lines 1 and 2, respectively, while -n suppresses the default echoing of every line.





I also understand how sed -n '/director/p' emp.lst matches all the lines containing the string director, and outputs:




9876|jai sharma |director |production|12/03/50|7000
2365|barun sengupta |director |personnel |11/05/47|7800
1006|chanchal singhvi |director |sales |03/09/38|6700
6521|lalit chowdury |director |marketing |26/09/45|8200



Now, when I specify two patterns as sed -n '/director/,/executive/p' emp.lst, the output shown is:




9876|jai sharma |director |production|12/03/50|7000
5678|sumit chakrobarty|d.g.m. |marketing |19/04/43|6000
2365|barun sengupta |director |personnel |11/05/47|7800
5423|n.k. gupta |chairman |admin |30/08/56|5400
1006|chanchal singhvi |director |sales |03/09/38|6700
6213|karuna ganguly |g.m. |accounts |05/06/62|6300
1265|s.n. dasgupta |manager |sales |12/09/63|5600
4290|jayant Choudhury |executive|production|07/09/50|6000
6521|lalit chowdury |director |marketing |26/09/45|8200
3212|shyam saksena |d.g.m. |accounts |12/12/55|6000
3564|sudhir Agarwal |executive|personnel |06/07/47|7500



What does this output represent?



Is it all lines containing the pattern director or executive? Clearly not, as some of the lines contain neither pattern.





Is it all lines from the first one matching either pattern through the last one matching either pattern? No again: by that logic, one line (2476|anil aggarwal |manager |sales |01/05/59|5000) is missing from the output.





I have not been able to work out how the command sed -n '/director/,/executive/p' emp.lst operates. I have gone through the sed man page and still cannot deduce it.





How should I approach understanding how this works?



For context, I am running the sed command built into macOS High Sierra 10.13.6, from Bash version 4.4.





Note: I am a sed newbie. Please edit any mistake or incorrect terminology that I may have used.






2 Answers



https://www.gnu.org/software/sed/manual/sed.html#Range-Addresses:



An address range can be specified by specifying two addresses separated by a comma (,). An address range matches lines starting from where the first address matches, and continues until the second address matches (inclusively):




$ seq 10 | sed -n '4,6p'
4
5
6



Thus 1,2p does not mean "print lines 1 and 2" but "print all lines from line 1 through line 2, inclusive". The difference becomes clearer with e.g. 3,7p, which will not just print lines 3 and 7, but lines 3, 4, 5, 6, 7.
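To make the distinction concrete, here is a small sketch that contrasts a range address with two separate single-line addresses. The variable names range and pair are only illustrative.

```shell
# A range address: every line from 3 through 7.
range=$(seq 10 | sed -n '3,7p')

# Two separate single-line addresses: only line 3 and line 7.
pair=$(seq 10 | sed -n '3p;7p')

# Unquoted expansion joins the lines with spaces for compact display.
printf 'range: %s\n' "$(echo $range)"
printf 'pair:  %s\n' "$(echo $pair)"
```

The first command prints five lines, the second only two.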





/director/,/executive/p prints all lines between a starting line (matching director) and an ending line (matching executive).





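In your case, there are two matching ranges, each starting at a line matching director and ending at the next line matching executive. The first runs from 9876 (jai sharma, director) through 4290 (jayant Choudhury, executive), and the second from 6521 (lalit chowdury, director) through 3564 (sudhir Agarwal, executive). Line 2476 falls between the two ranges, after the first has closed and before the second opens, so it is not printed.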
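One way to picture the range address is as a two-state machine: outside a range, sed looks only for the start pattern; inside a range, it prints every line and looks only for the end pattern. The sketch below spells that out in awk and compares it with sed on a shortened, hypothetical stand-in for emp.lst (designations only). It is an illustration of the logic, not how sed is implemented.

```shell
# Hypothetical sample data: one designation per line, standing in for emp.lst.
sample='g.m.
director
d.g.m.
director
executive
manager
director
executive
g.m.'

# The range address from the question.
with_sed=$(printf '%s\n' "$sample" | sed -n '/director/,/executive/p')

# The same logic as an explicit state machine ("inside" is an illustrative flag).
with_awk=$(printf '%s\n' "$sample" | awk '
  inside     { print; if (/executive/) inside = 0; next }  # in a range: print, close on end match
  /director/ { print; inside = 1 }                         # outside: open a range on start match
')

printf '%s\n' "$with_sed"
```

Both commands print the same six lines. The manager line and the final g.m. line are skipped because no range is open when they are read, and the second director inside the first range does not restart anything because the start pattern is not checked while a range is open.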








Thanks a lot. Your explanation was very helpful and I can now clearly understand how it works. I experimented with various placements of director and executive within the file. man pages, although logically precise and concise, are generally hard for a newbie to understand.

– Nimesh Neema
Sep 9 '18 at 23:39








I have a follow up query and felt it makes more sense to ask here than to write an altogether separate question. Is there a way to select just the first/last/any intermediate one of the many matching blocks?

– Nimesh Neema
Sep 10 '18 at 10:47
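As a possible approach to this follow-up (a sketch, not something given in the answers here): the first block can be selected by quitting once its end line has been printed, and an intermediate block by counting range openings in awk. The sample data and the variable n are hypothetical.

```shell
# Hypothetical sample data: one designation per line, standing in for emp.lst.
sample='g.m.
director
d.g.m.
director
executive
manager
director
executive
g.m.'

# First block only: print the range and quit right after its end line.
first=$(printf '%s\n' "$sample" | sed -n '/director/,/executive/{p;/executive/q;}')

# n-th block: count each range opening and print only block number n.
nth=$(printf '%s\n' "$sample" | awk -v n=2 '
  inside     { if (block == n) print; if (/executive/) inside = 0; next }
  /director/ { block++; inside = 1; if (block == n) print }
')

printf 'first block:\n%s\n' "$first"
printf 'block 2:\n%s\n' "$nth"
```

Selecting the last block this way needs the total number of blocks in advance, for example from a separate counting pass over the file.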




From man sed:




0,addr2
Start out in "matched first address" state, until addr2 is found.
This is similar to 1,addr2, except that if addr2 matches the very
first line of input the 0,addr2 form will be at the end of its range,
whereas the 1,addr2 form will still be at the beginning of its range.
This works only when addr2 is a regular expression.



Not 100% sure if this is the manual section that applies but it looks like you have 2 blocks from "director" to "executive" in your output above.
There happen to be some other "director" lines between the first "director" and first succeeding "executive".






That section does not apply. There is no 0, in OP's code.

– melpomene
Sep 9 '18 at 23:31




