
Usage of sed command


Tag: grep, awk, sed


Convert DNA to RNA (source)

sed '/^[^>]/ y/tT/uU/' combine_all.full.fa > combine_all.full.RNA.fa
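A minimal sketch of the same idea on a toy FASTA record, assuming the goal is to turn both `t` into `u` and `T` into `U` on sequence lines while leaving `>` header lines alone. The record itself is made up for illustration and is not from combine_all.full.fa.

```shell
# Toy FASTA record (hypothetical data, not from the original file).
fasta='>seq1 test
ACGTacgt'

# On lines that do not start with '>', transliterate t->u and T->U.
# The header keeps its "t" characters because the address skips it.
rna=$(printf '%s\n' "$fasta" | sed '/^[^>]/ y/tT/uU/')
printf '%s\n' "$rna"
```

The `y` command maps characters by position, so the first list holds the DNA letters and the second list holds their RNA replacements.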

Combine every two lines (source)

[zhangqf7@loginview02 RIP]$ grep -v "^tx" predict_MO_decay.overlap.inall.motif.decay.txt|head
NM_001077537    3657    3663    0.2578571428571429      common  sphere
NM_001077537    3657    3663    0.19057142857142856     common  shield
NM_201059       2029    2035    0.2177142857142857      common  sphere
NM_201059       2029    2035    0.15457142857142858     common  shield
NM_201059       1880    1886    0.01    common  sphere
NM_201059       1880    1886    0.022857142857142857    common  shield
NM_001077537    3425    3431    0.1821428571428572      common  sphere
NM_001077537    3425    3431    0.15214285714285716     common  shield
NM_001256642    1533    1539    0.07742857142857143     common  sphere
NM_001256642    1533    1539    0.11457142857142855     common  shield
[zhangqf7@loginview02 RIP]$ grep -v "^tx" predict_MO_decay.overlap.inall.motif.decay.txt|head|sed 'N;s/\n/\t/'
NM_001077537    3657    3663    0.2578571428571429      common  sphere  NM_001077537    3657    3663    0.19057142857142856     common  shield
NM_201059       2029    2035    0.2177142857142857      common  sphere  NM_201059       2029    2035    0.15457142857142858     common  shield
NM_201059       1880    1886    0.01    common  sphere  NM_201059       1880    1886    0.022857142857142857    common  shield
NM_001077537    3425    3431    0.1821428571428572      common  sphere  NM_001077537    3425    3431    0.15214285714285716     common  shield
NM_001256642    1533    1539    0.07742857142857143     common  sphere  NM_001256642    1533    1539    0.11457142857142855     common  shield
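A small sketch of how the pairing works, on made-up input. `N` appends the next line to the pattern space, and the `s` command then replaces the embedded newline with a tab. This assumes GNU sed, where `\t` in the replacement means a tab. As a possible alternative, `paste - -` reads two lines per output row and joins them with a tab.

```shell
# Four toy lines (hypothetical data); each pair should end up on one row.
pairs=$(printf 'a 1\na 2\nb 1\nb 2\n' | sed 'N;s/\n/\t/')

# Alternative sketch: paste consumes stdin twice per row, joining with a tab.
pasted=$(printf 'a 1\na 2\nb 1\nb 2\n' | paste - -)

printf '%s\n' "$pairs"
printf '%s\n' "$pasted"
```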

Calling from Python subprocess (source)

subprocess.call(['''grep -v "^tx" %s|sed "N;s/\n/\t/" > %s'''%(savefn, savefn.replace('.txt', '.merge.txt'))], shell=True)

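This raises the error "sed: -e expression #1, char 2: extra characters after command", because Python interprets the backslash sequences \n and \t in an ordinary string literal before the command ever reaches sed. Each backslash must be escaped (or the string written as a raw string). With the backslashes doubled, as below, it works: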

subprocess.call(['''grep -v "^tx" %s|sed "N;s/\\n/\\t/" > %s'''%(savefn, savefn.replace('.txt', '.merge.txt'))], shell=True)

Replace newlines with commas (stackexchange)

[zhangqf7@loginview02 60]$ ll|awk '{print $9}'|cut -d "_" -f 1-2|sort|uniq|head

NM_001003422
NM_001003429
NM_001004577
NM_001017572
NM_001025520
NM_001025533
NM_001033107
NM_001044395
NM_001045231
[zhangqf7@loginview02 60]$ ll|awk '{print $9}'|cut -d "_" -f 1-2|sort|uniq|head|sed ':a;N;$!ba;s/\n/,/g'
,NM_001003422,NM_001003429,NM_001004577,NM_001017572,NM_001025520,NM_001025533,NM_001033107,NM_001044395,NM_001045231
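The leading comma in the output above comes from the empty first line of the input (most likely the `total` line printed by `ll`, which has no ninth field). A minimal sketch on made-up IDs, assuming GNU sed: the loop `:a;N;$!ba` collects every line into the pattern space, then `s/\n/,/g` joins them. Deleting empty lines first removes the stray comma.

```shell
# Toy input with a blank first line (hypothetical IDs).
with_blank=$(printf '\nNM_1\nNM_2\nNM_3\n' | sed ':a;N;$!ba;s/\n/,/g')

# Drop empty lines before joining, so no leading comma appears.
clean=$(printf '\nNM_1\nNM_2\nNM_3\n' | sed '/^$/d' | sed ':a;N;$!ba;s/\n/,/g')

printf '%s\n' "$with_blank"
printf '%s\n' "$clean"
```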

Replace one or more spaces with a tab

➜  git:(master)head test.txt
    1 U       0    2    0    1
    2 G       1    3    0    2
    3 G       2    4    0    3
    4 G       3    5    0    4
    5 C       4    6    0    5
    6 C       5    7  191    6
    7 G       6    8  189    7
    8 G       7    9  188    8
    9 G       8   10  187    9
   10 A       9   11  186   10
   
# Each [ ] contains a single space
# *: means zero or more
# So one or more spaces is: [ ][ ]*
➜  git:(master)head test.txt|sed 's/^[ ]*//g'|sed 's/[ ][ ]*/\t/g'
1       U       0       2       0       1
2       G       1       3       0       2
3       G       2       4       0       3
4       G       3       5       0       4
5       C       4       6       0       5
6       C       5       7       191     6
7       G       6       8       189     7
8       G       7       9       188     8
9       G       8       10      187     9
10      A       9       11      186     10
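The two sed calls can probably be folded into one. This is only a sketch, assuming GNU sed: `-E` enables extended regular expressions, so ` +` means "one or more spaces" (the same as `[ ][ ]*`), and `\t` in the replacement is a tab. The sample line is copied from test.txt above.

```shell
# One line from test.txt, with leading and repeated spaces.
line='    6 C       5    7  191    6'

# First strip leading spaces, then turn each run of spaces into a single tab.
tabbed=$(printf '%s\n' "$line" | sed -E 's/^ +//; s/ +/\t/g')
printf '%s\n' "$tabbed"
```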

If you link to this post, please cite this page, thanks!
Post link: https://tsinghua-gongjing.github.io/posts/sed.html
