sed '/^[^>]/ y/uT/tU/' combine_all.full.fa > combine_all.full.RNA.fa
[zhangqf7@loginview02 RIP]$ grep -v "^tx" predict_MO_decay.overlap.inall.motif.decay.txt|head
NM_001077537 3657 3663 0.2578571428571429 common sphere
NM_001077537 3657 3663 0.19057142857142856 common shield
NM_201059 2029 2035 0.2177142857142857 common sphere
NM_201059 2029 2035 0.15457142857142858 common shield
NM_201059 1880 1886 0.01 common sphere
NM_201059 1880 1886 0.022857142857142857 common shield
NM_001077537 3425 3431 0.1821428571428572 common sphere
NM_001077537 3425 3431 0.15214285714285716 common shield
NM_001256642 1533 1539 0.07742857142857143 common sphere
NM_001256642 1533 1539 0.11457142857142855 common shield
[zhangqf7@loginview02 RIP]$ grep -v "^tx" predict_MO_decay.overlap.inall.motif.decay.txt|head|sed 'N;s/\n/\t/'
NM_001077537 3657 3663 0.2578571428571429 common sphere NM_001077537 3657 3663 0.19057142857142856 common shield
NM_201059 2029 2035 0.2177142857142857 common sphere NM_201059 2029 2035 0.15457142857142858 common shield
NM_201059 1880 1886 0.01 common sphere NM_201059 1880 1886 0.022857142857142857 common shield
NM_001077537 3425 3431 0.1821428571428572 common sphere NM_001077537 3425 3431 0.15214285714285716 common shield
NM_001256642 1533 1539 0.07742857142857143 common sphere NM_001256642 1533 1539 0.11457142857142855 common shield
subprocess.call(['''grep -v "^tx" %s|sed "N;s/\n/\t/" > %s'''%(savefn, savefn.replace('.txt', '.merge.txt'))], shell=True)
cause error: sed: -e expression #1, char 2: extra characters after command
, beacuse there is slash \
in the regular expression, which should be escaped as below, now it works:
subprocess.call(['''grep -v "^tx" %s|sed "N;s/\\n/\\t/" > %s'''%(savefn, savefn.replace('.txt', '.merge.txt'))], shell=True)
[zhangqf7@loginview02 60]$ ll|awk '{print $9}'|cut -d "_" -f 1-2|sort|uniq|head
NM_001003422
NM_001003429
NM_001004577
NM_001017572
NM_001025520
NM_001025533
NM_001033107
NM_001044395
NM_001045231
[zhangqf7@loginview02 60]$ ll|awk '{print $9}'|cut -d "_" -f 1-2|sort|uniq|head|sed ':a;N;$!ba;s/\n/,/g'
,NM_001003422,NM_001003429,NM_001004577,NM_001017572,NM_001025520,NM_001025533,NM_001033107,NM_001044395,NM_001045231
➜ git:(master) ✗ head test.txt
1 U 0 2 0 1
2 G 1 3 0 2
3 G 2 4 0 3
4 G 3 5 0 4
5 C 4 6 0 5
6 C 5 7 191 6
7 G 6 8 189 7
8 G 7 9 188 8
9 G 8 10 187 9
10 A 9 11 186 10
# 每个[ ]包含一个空格
# *:表示0个或者多个
# 所以>=1个:[ ][ ]*
➜ git:(master) ✗ head test.txt|sed 's/^[ ]*//g'|sed 's/[ ][ ]*/\t/g'
1 U 0 2 0 1
2 G 1 3 0 2
3 G 2 4 0 3
4 G 3 5 0 4
5 C 4 6 0 5
6 C 5 7 191 6
7 G 6 8 189 7
8 G 7 9 188 8
9 G 8 10 187 9
10 A 9 11 186 10