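Resource overview
This script computes, for the gene sequences in a FASTA file, the N50, the number of sequences, and the lengths of the shortest and longest sequences. Copy the script into the directory that contains the FASTA file and run it with: python cal_N50.py
When the prompt "Enter your fasta/fa name: " appears, type the name of a FASTA file in the current directory and press Enter.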

Code snippet and file information
#GC_N50.py
# Written for Python 3 (print() and input() replace the Python 2 forms).
from Bio import SeqIO

print('Python and Biopython needed for running this script!')
print("script for calculating N50 of assembly")
fasta = input('Enter your fasta/fa name: ')

# Running totals: total bases, per-sequence lengths, cumulative length, N50
baseSum, Length = 0, []
ValueSum, N50 = 0, 0
# Per-base counts (accumulated here but not reported by this snippet)
no_c, no_g, no_a, no_t, no_n = 0, 0, 0, 0, 0

for record in SeqIO.parse(fasta, "fasta"):
    baseSum += len(record.seq)
    Length.append(len(record.seq))
    seq = record.seq.lower()
    no_c += seq.count('c')
    no_g += seq.count('g')
    no_a += seq.count('a')
    no_t += seq.count('t')
    no_n += seq.count('n')

# N50 calculation: walk the lengths from longest to shortest and stop at
# the first one whose cumulative sum reaches half of the total bases.
N50_pos = baseSum / 2.0
Length.sort()
Length.reverse()
for value in Length:
    ValueSum += value
    if N50_pos <= ValueSum:
        N50 = value
        break

print('Sequences NO.:' + '\t' + str(len(Length)))
print('Sequences Min.:' + '\t' + str(min(Length)))
print('Sequences Max.:' + '\t' + str(max(Length)))
print('N50: ' + str(N50))
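The N50 step of the script above can also be tried without Biopython or a file on disk. The following is only a minimal sketch under stated assumptions: the helper names fasta_lengths and n50 are hypothetical and are not part of the original script, and the input is assumed to be plain FASTA text with ">" header lines. It uses the same rule as the script: sort lengths from longest to shortest and return the first length at which the running total reaches half of all bases.

```python
def fasta_lengths(lines):
    """Return the length of each record in an iterable of FASTA-formatted lines.

    Hypothetical helper for illustration; the original script uses Bio.SeqIO.
    """
    lengths = []
    current = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a list of sequence lengths (0 for an empty list)."""
    half_total = sum(lengths) / 2.0
    running = 0
    for value in sorted(lengths, reverse=True):
        running += value
        if running >= half_total:
            return value
    return 0


# Five toy records with lengths 6, 5, 4, 3 and 2 (20 bases in total).
demo = [">a", "ACGTAC", ">b", "ACGTA", ">c", "ACGT", ">d", "ACG", ">e", "AC"]
lengths = fasta_lengths(demo)
print(lengths)       # [6, 5, 4, 3, 2]
print(n50(lengths))  # 5
```

In the toy input, half of the 20 bases is 10; the running total is 6 after the longest record and 11 after the second, so the N50 is the second length, 5.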
 Attribute     Size  Date        Time   Name
 ---------  -------  ----------  -----  ----
 File           278  2020-11-17  08:48  使用方法.txt
 Directory        0  2020-11-17  08:48  __MACOSX\
 File           210  2020-11-17  08:48  __MACOSX\._使用方法.txt
 File           885  2020-11-16  23:52  计算N50的python脚本.py
 File           613  2020-11-16  23:52  __MACOSX\._计算N50的python脚本.py