Sed

From Blue-IT.org Wiki

Latest revision as of 12:56, 8 June 2016

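Safe search and replace

The following snippet performs a search and replace with sed that is safe with respect to special characters: both strings are escaped first, so sed treats them as literal text rather than as a regular expression.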

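# Escape everything that is special in a sed regular expression, plus the "/" delimiter
search="$(printf '%s\n' "SEARCHSTRING" | sed 's|[][\.*^$/]|\\&|g')"
# In the replacement part only "\", "&" and "/" are special
replace="$(printf '%s\n' "REPLACESTRING" | sed 's|[\&/]|\\&|g')"
sed "s/${search}/${replace}/g" "${DOC}"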

Or, if you prefer to use a function:

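# Search and replace text in a document
# param1: search string (taken literally)
# param2: replace string (taken literally)
# param3: the document (path to the file; it is edited in place)
search_replace(){

   local search replace
   # Escape regex metacharacters and the "/" delimiter in the search string
   search="$(printf '%s\n' "${1}" | sed 's|[][\.*^$/]|\\&|g')"
   # Escape "\", "&" and "/" in the replacement string
   replace="$(printf '%s\n' "${2}" | sed 's|[\&/]|\\&|g')"
   sed -i "s/${search}/${replace}/g" "${3}"

}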

[...]

search_replace "SEARCHSTRING" "REPLACESTRING" "DOCUMENT_URL"
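
To see why the escaping matters, here is a minimal sketch that applies the same idea to text read from standard input instead of a file. The helper names escape_search, escape_replace and replace_literal are assumptions made for this example and do not appear in the snippet above. The replacement string is escaped separately because only "\", "&" and "/" are special there.

```shell
#!/bin/sh
# Sketch: escape arbitrary text for use in a sed "s/.../.../" command.
# escape_search, escape_replace and replace_literal are hypothetical helper names.

# Escape characters that are special in a basic regular expression, plus "/".
escape_search() {
    printf '%s\n' "$1" | sed 's|[][\.*^$/]|\\&|g'
}

# Escape characters that are special in the replacement part: "\", "&" and "/".
escape_replace() {
    printf '%s\n' "$1" | sed 's|[\&/]|\\&|g'
}

# Replace every literal occurrence of $1 with $2 in the text read from stdin.
replace_literal() {
    sed "s/$(escape_search "$1")/$(escape_replace "$2")/g"
}

# Demo: "$", "." and "&" are taken literally
printf '%s\n' 'price: $5.00 (a*b)' | replace_literal '$5.00' 'R&D/50%'
```

With the sample input this prints "price: R&D/50% (a*b)". Without the escaping, "&" in the replacement would be expanded to the matched text and "." in the search string would match any character. Strings that contain newlines are not covered by this sketch.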

Cut a certain passage out of a file

Let's assume we have the following sample text:

<h1 class = title>
 Lorem ipsum
</h1>
<div id=content>
   <div id=abstract>
   Lorem ipsum dolor sit amet, consetetur sadipscing elitr,
   sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat,
   sed diam voluptua.
   </div>
 At vero eos et accusam et justo duo dolores et ea rebum. 
 Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. 
 Lorem ipsum dolor sit amet, consetetur sadipscing elitr, 
 sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat,
 sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. 
 Stet clita kasd gubergren, 
 no sea takimata sanctus est Lorem ipsum dolor sit amet.
</div>

We want to cut out the abstract (the <div id=abstract> block) and get:

<h1 class = title>
 Lorem ipsum
</h1> 
<div id=content>
 At vero eos et accusam et justo duo dolores et ea rebum. 
 Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. 
 Lorem ipsum dolor sit amet, consetetur sadipscing elitr, 
 sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat,
 sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. 
 Stet clita kasd gubergren, 
 no sea takimata sanctus est Lorem ipsum dolor sit amet.
</div>

The next snippet searches for the first, and only the first, occurrence of the expression "FIRST_TEXT_FRAGMENT" and then, starting from the following line, for the first occurrence of "LAST_TEXT_FRAGMENT". Both fragments are matched as fixed strings, not as regular expressions. It then uses sed to cut out all lines between these two, including the two matching lines themselves, and writes the result to tmp2.

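ARTICLE="${1}"

FIRST_TEXT_FRAGMENT="<div id=abstract>"
LAST_TEXT_FRAGMENT="</div>"

# Line number of the first line that contains FIRST_TEXT_FRAGMENT
BEGIN_CUT=$(grep -F -n --max-count=1 -- "${FIRST_TEXT_FRAGMENT}" "${ARTICLE}" | cut -d':' -f1)
[ -n "${BEGIN_CUT}" ] || { echo "FIRST_TEXT_FRAGMENT not found" >&2; exit 1; }

# Position of the first line that contains LAST_TEXT_FRAGMENT,
# counted from the line after BEGIN_CUT
END_CUT=$(sed -e "1,${BEGIN_CUT}d" "${ARTICLE}" | grep -F -n --max-count=1 -- "${LAST_TEXT_FRAGMENT}" | cut -d':' -f1)
[ -n "${END_CUT}" ] || { echo "LAST_TEXT_FRAGMENT not found" >&2; exit 1; }

# Convert the relative position into an absolute line number
END_CUT=$((END_CUT + BEGIN_CUT))

# Delete lines BEGIN_CUT to END_CUT (inclusive) and write the result to tmp2
sed -e "${BEGIN_CUT},${END_CUT}d" "${ARTICLE}" > tmp2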
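
The steps above can also be wrapped in a function and tried on a shortened version of the sample text. This is only a sketch: the function name cut_passage and its argument order are assumptions for this example, and it prints the result to standard output instead of writing tmp2. If one of the fragments is not found, it prints the file unchanged.

```shell
#!/bin/sh
# Sketch: the same grep/sed steps wrapped in a function (cut_passage is a hypothetical name).
# Usage: cut_passage FILE FIRST_TEXT_FRAGMENT LAST_TEXT_FRAGMENT
# Prints FILE without the first passage delimited by the two fragments.
cut_passage() {
    file="$1"; first="$2"; last="$3"

    # Line number of the first line containing the opening fragment
    begin=$(grep -F -n -m 1 -- "$first" "$file" | cut -d: -f1)
    # Nothing to cut: print the file unchanged
    [ -n "$begin" ] || { cat "$file"; return 0; }

    # Offset (relative to the line after $begin) of the first closing fragment
    offset=$(sed "1,${begin}d" "$file" | grep -F -n -m 1 -- "$last" | cut -d: -f1)
    [ -n "$offset" ] || { cat "$file"; return 0; }

    # Delete both marker lines and everything between them
    sed "${begin},$((begin + offset))d" "$file"
}

# Demo on a shortened version of the sample text
sample=$(mktemp)
cat > "$sample" <<'EOF'
<h1 class = title>
 Lorem ipsum
</h1>
<div id=content>
   <div id=abstract>
   Lorem ipsum dolor sit amet
   </div>
 At vero eos et accusam
</div>
EOF
cut_passage "$sample" "<div id=abstract>" "</div>"
rm -f "$sample"
```

The demo prints the heading and the content block without the three abstract lines. Because only the first match of each fragment is used, a later passage with the same markers stays untouched.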


Search for a string and prepend text to the beginning of the matching lines

sed -n '/SEARCH_STRING/{s|^|//|};p' myfile.php
sed '/SEARCH_STRING/s|^|//|' file
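
Both commands print every line of the file and put "//" in front of the lines that contain SEARCH_STRING, which comments those lines out in PHP. A small sketch on sample input shows the effect. The function name comment_out is an assumption for this example, and the pattern is passed to sed as a regular expression without escaping, so it must not contain "/".

```shell
#!/bin/sh
# Sketch: prefix every line that matches a pattern with "//".
# comment_out is a hypothetical helper name; $1 is used as a sed regular expression as-is.
comment_out() {
    sed "/$1/s|^|//|"
}

# Demo: only the line containing "debug" gets the "//" prefix
printf '%s\n' '<?php' 'echo "debug";' 'echo "hello";' '?>' | comment_out 'debug'
```

Here the second line comes out as //echo "debug"; and the other three lines are printed unchanged.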