Published Histories | sphaeromeria | imported: Curso NGS Pisum 300.000

Galaxy History 'imported: Curso NGS Pisum 300.000'


Dataset | Annotation
1: pea paired forward REDUCED DATASET from ERR063464_1
239.4 MB
format: fastq, database: ?
@ERR063464.1 E201_0095:6:1:2672:996#0/1
NCACCAAGCACGACTTTAATTACCATGCCTAAAAACAACTAGACAAAATTTGGAGATTATCAAAAAAAGTCCCATTCAATTTGGATTAGGGATGATTAAA
+
####################################################################################################
@ERR063464.2 E201_0095:6:1:2814:995#0/1
NTTTTCTTAAATCAAACTTGTAAACAAACTTAACTATACTTGACTTAAACTTTCAAAAAGACAAAAAGAACTAACTCATTCAGACCATTTTAGGCCTTTG
2: pea paired reverse REDUCED DATASET from ERR063464_2
239.4 MB
format: fastq, database: ?
@ERR063464.1 E201_0095:6:1:2672:996#0/2
TNCNGGACAATTTCGGGGCCATATTTGTGATCTACATTGAATGCCGGTTACAACGATATATGATTTGTTCGATTTGGTTGAAAATTGTTCATATCGAATT
+
7#7#8=85>=@B?BE-@DDEF@E<8CD<CC8GGDGG>GEIIIIDIGDA,B2::+4:<=7?BB3--:<45;B92BB=<=3@####################
@ERR063464.2 E201_0095:6:1:2814:995#0/2
CNANACCCAAATATTAAGAAGTTTTCAAATAAAAACTCATAAAAGTCAGAGATCACAGGTAAGGGGGTTGGTTACATAGAGGGACGGGGTCAGCACCCAC
3: FASTQ Groomer on data 1
239.4 MB
format: fastqsanger, database: ?
Info:
Groomed 1000000 sanger reads into sanger reads.
Based upon quality and sequence, the input data is valid for: sanger
Input ASCII range: '#'(35) - 'I'(73)
Input decimal range: 2 - 40
@ERR063464.1 E201_0095:6:1:2672:996#0/1
NCACCAAGCACGACTTTAATTACCATGCCTAAAAACAACTAGACAAAATTTGGAGATTATCAAAAAAAGTCCCATTCAATTTGGATTAGGGATGATTAAA
+
####################################################################################################
@ERR063464.2 E201_0095:6:1:2814:995#0/1
NTTTTCTTAAATCAAACTTGTAAACAAACTTAACTATACTTGACTTAAACTTTCAAAAAGACAAAAAGAACTAACTCATTCAGACCATTTTAGGCCTTTG
4: FASTQ Groomer on data 2
239.4 MB
format: fastqsanger, database: ?
Info:
Groomed 1000000 sanger reads into sanger reads.
Based upon quality and sequence, the input data is valid for: sanger
Input ASCII range: '#'(35) - 'I'(73)
Input decimal range: 2 - 40
@ERR063464.1 E201_0095:6:1:2672:996#0/2
TNCNGGACAATTTCGGGGCCATATTTGTGATCTACATTGAATGCCGGTTACAACGATATATGATTTGTTCGATTTGGTTGAAAATTGTTCATATCGAATT
+
7#7#8=85>=@B?BE-@DDEF@E<8CD<CC8GGDGG>GEIIIIDIGDA,B2::+4:<=7?BB3--:<45;B92BB=<=3@####################
@ERR063464.2 E201_0095:6:1:2814:995#0/2
CNANACCCAAATATTAAGAAGTTTTCAAATAAAAACTCATAAAAGTCAGAGATCACAGGTAAGGGGGTTGGTTACATAGAGGGACGGGGTCAGCACCCAC
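
The FASTQ Groomer reports for datasets 3 and 4 give an input ASCII range of '#'(35) - 'I'(73) and a decimal range of 2 - 40. The two ranges differ by the Sanger offset of 33. A minimal Python sketch of that conversion follows; the function name `phred_scores` is illustrative and not part of Galaxy.

```python
SANGER_OFFSET = 33  # ASCII offset of the Sanger (Phred+33) encoding


def phred_scores(quality_line, offset=SANGER_OFFSET):
    """Return the decimal quality score for each character of a FASTQ quality line."""
    return [ord(ch) - offset for ch in quality_line]


# The two ends of the reported ASCII range map to the reported decimal range.
print(phred_scores("#I"))  # [2, 40]
```

A quality line made only of '#' characters, as in the first forward read, therefore scores 2 at every position.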
5: Filter by quality on data 3
174.3 MB
format: fastqsanger, database: ?
Info:
Quality cut-off: 20
Minimum percentage: 90
Input: 1000000 reads.
Output: 728326 reads.
discarded 271674 (27%) low-quality reads.
@ERR063464.8 E201_0095:6:1:3696:999#0/1
NAATTCAAACCTTTCGATTCTTGAAATTGACATGCGGTTTAGGTAGATCTTGCAACATAGAACACACTGCAACTTGAATCATGCGTTTTCGTTAAGAATT
+
#----77777CCC@@C@@@C@@@@@@C@@C@@@@@58888<<<<<@@@@@@C@@@@C@CC@@@@@@@@@@@C@@@C@@@@@@@@@<::<<::8:::<:<7
@ERR063464.50 E201_0095:6:1:10920:992#0/1
NGTGAATTCCTCATTTGTACCTGAACATTCATCTCTTCGAAAGTCCCAATGTAAAAATCCAAACATGATCATGTGATTAATGGTTGGAAAGGTCTTGATG
6: Filter by quality on data 4
142.2 MB
format: fastqsanger, database: ?
Info:
Quality cut-off: 20
Minimum percentage: 90
Input: 1000000 reads.
Output: 594021 reads.
discarded 405979 (40%) low-quality reads.
@ERR063464.7 E201_0095:6:1:3604:993#0/2
ANCNTTCTTTCCCCTGCCTGAGTCTATCCTTGATATGTTCATCTTAATCGATGACGGATATTCCCATCCTTGGTATTCTACCTAGTAAAAAGGTAGTTGT
+
:#:#<BBBB?IIIIB-GGGGGGDG>GGGGGIIIIIIIIHIIIIIHIIHIIGIBIIIFHFFBGEGEIBIEDCEE>EECEEEDBBFEBD8CE2A?3??9=?9
@ERR063464.9 E201_0095:6:1:3964:1000#0/2
GNANAAAGATCCTTAATGAGAAGAAGGCAATCAACATACTCCAAAGCGCTCTTAGTATGGACGAGTTCTTCCGTATATCTCAATGTAAATCGGCACAGGA
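
Datasets 5 and 6 both report a quality cut-off of 20 and a minimum percentage of 90. Read literally, a read is kept when at least 90% of its bases score 20 or more. The Python sketch below shows that rule under two assumptions: qualities are Sanger-encoded (offset 33), and a read sitting exactly at 90% is kept. The function name `passes_quality_filter` is hypothetical, and the boundary behavior of the Galaxy tool itself is not stated in this history.

```python
def passes_quality_filter(quality_line, cutoff=20, min_percent=90, offset=33):
    """Keep a read when at least min_percent of its bases score >= cutoff."""
    if not quality_line:
        return False
    good = sum(1 for ch in quality_line if ord(ch) - offset >= cutoff)
    # Integer comparison avoids floating-point rounding at the boundary.
    return good * 100 >= min_percent * len(quality_line)


# A read whose quality line is all '#' (score 2) is rejected.
print(passes_quality_filter("#" * 100))  # False
```

This is consistent with the previews above: the forward read ERR063464.1, whose quality line is entirely '#', does not appear in dataset 5.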
7: FASTQ to FASTA on data 5
715,402 sequences
format: fasta, database: ?
Info:
Input: 728326 reads.
Output: 715402 reads.
discarded 12924 (1%) low-quality reads.
>ERR063464.319 E201_0095:6:1:3573:1035#0/1
ACTGGATCTGATTTTAATTAACTCAAACAATGTAATCTTATTCTATTTCATTTCATATCACTTCATATCATCTCATTGCATCAATTTAAGCAAAACCTTA
>ERR063464.322 E201_0095:6:1:4255:1031#0/1
GAAGGTCAATCTAGCTATAAATGGAAAATGCATGCCTTAAGGTGTAATAATCACTGCCTCAATGCCACAACCATATGAATTAACAGCTCCATCAAATACC
>ERR063464.327 E201_0095:6:1:5072:1033#0/1
ATTGCCAGAGCATACTACCTCAACCAACCTCTGAATCTTCATATTCTCACTCCTTGAAGGAACTGCCACTTCAGAGCATGGTGTTTGAATAACAAGATTC
8: FASTQ to FASTA on data 6
550,961 sequences
format: fasta, database: ?
Info:
Input: 594021 reads.
Output: 550961 reads.
discarded 43060 (7%) low-quality reads.
>ERR063464.902 E201_0095:6:1:3333:1081#0/2
GGAAGGGTTAGGGTTTTAGGATCAAAGCTAGCGCGAATGTGTGTTGAATTATGAGCATATGGAGCACTATTTATAGTTGGATCATGTGATATTTACACCA
>ERR063464.1050 E201_0095:6:1:2937:1090#0/2
TTTGTTTCCATCATCATATAGTGAAGTCCAACAAACCTGGATATCCTAGTTTGCATCTTTAATTGCTTAACCGCTTATACTTGACTTGATATTAAGGACA
>ERR063464.1054 E201_0095:6:1:3385:1085#0/2
TCTCTGTCAGCATCTCCATCTTGTTGAGATTAGCCAAGTATTTGGTGGTACTGTGTTATCGGTCTCGCGCCTTTGAGTTATTTTTGTCAACATAATTATA
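
The reports for datasets 7 and 8 show reads being discarded during conversion (12924 and 43060) without naming the criterion. The Python sketch below converts four-line FASTQ records to FASTA and, as an assumption that this history does not confirm, drops reads containing an unknown base (N). The function name `fastq_to_fasta` and the `drop_unknown` parameter are illustrative.

```python
def fastq_to_fasta(fastq_lines, drop_unknown=True):
    """Yield (header, sequence) FASTA records from four-line FASTQ records."""
    lines = [ln.rstrip("\n") for ln in fastq_lines]
    for i in range(0, len(lines) - 3, 4):
        header, seq = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError("FASTQ header must start with '@': " + header)
        # Assumption: reads with an unknown base are the ones discarded.
        if drop_unknown and "N" in seq.upper():
            continue
        yield ">" + header[1:], seq


records = ["@r1", "ACGT", "+", "IIII", "@r2", "NCGT", "+", "#III"]
for header, seq in fastq_to_fasta(records):
    print(header)
    print(seq)
```

Only the header and sequence lines are carried over; the '+' separator and quality line have no FASTA equivalent.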
9: interlaced paired reads from datasets 7 and 8
946,848 sequences
format: fasta, database: ?
>ERR063464.902 E201_0095:6:1:3333:1081#0/1
AATTTTTTCAATATGGTGAAATTTCCAAACTTCTACATCAAAATTCATCATGATACAAGCTTCAAATAGAAAAGTGTTGAACATGAAAGTTGTTCCTCTT
>ERR063464.902 E201_0095:6:1:3333:1081#0/2
GGAAGGGTTAGGGTTTTAGGATCAAAGCTAGCGCGAATGTGTGTTGAATTATGAGCATATGGAGCACTATTTATAGTTGGATCATGTGATATTTACACCA
>ERR063464.1050 E201_0095:6:1:2937:1090#0/1
TCATATAAGCTCTTTGAAATAATCATGTGGTCTTCATCTCAAGGAAAGACTTGCTCTTTTGCACTGAGGATTCTCTTAATGCCAATTCTGACTGCCATGA
10: reads without available pair reads from datasets 7 and 8
319,515 sequences
format: fasta, database: ?
>ERR063464.319 E201_0095:6:1:3573:1035#0/1
ACTGGATCTGATTTTAATTAACTCAAACAATGTAATCTTATTCTATTTCATTTCATATCACTTCATATCATCTCATTGCATCAATTTAAGCAAAACCTTA
>ERR063464.322 E201_0095:6:1:4255:1031#0/1
GAAGGTCAATCTAGCTATAAATGGAAAATGCATGCCTTAAGGTGTAATAATCACTGCCTCAATGCCACAACCATATGAATTAACAGCTCCATCAAATACC
>ERR063464.327 E201_0095:6:1:5072:1033#0/1
ATTGCCAGAGCATACTACCTCAACCAACCTCTGAATCTTCATATTCTCACTCCTTGAAGGAACTGCCACTTCAGAGCATGGTGTTTGAATAACAAGATTC
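
Datasets 9 and 10 split the reads of datasets 7 and 8 into interlaced pairs and reads without a mate. The counts are consistent: 946,848 + 319,515 = 1,266,363 = 715,402 + 550,961. The Python sketch below shows one way to perform this split by matching headers after removing the /1 or /2 suffix. The function names are hypothetical, and ordering pairs by the forward file is an assumption; the Galaxy tool may order its output differently.

```python
def read_id(header):
    """Strip the leading '>' and the trailing /1 or /2 mate suffix from a header."""
    name = header.lstrip(">")
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def interlace(forward, reverse):
    """Split two lists of (header, sequence) into interlaced pairs and singletons."""
    reverse_by_id = {read_id(h): (h, s) for h, s in reverse}
    forward_ids = {read_id(h) for h, _ in forward}
    paired, single = [], []
    for h, s in forward:
        mate = reverse_by_id.get(read_id(h))
        if mate is None:
            single.append((h, s))  # forward read whose mate was filtered out
        else:
            paired.extend([(h, s), mate])  # /1 immediately followed by /2
    # Reverse reads whose forward mate is missing are singletons too.
    single.extend((h, s) for h, s in reverse if read_id(h) not in forward_ids)
    return paired, single
```

Every input read ends up in exactly one of the two outputs, which is why the sequence counts of datasets 9 and 10 add up to those of datasets 7 and 8.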
11: Random selection from dataset 9 (sample size 300000)
300,000 sequences
format: fasta, database: ?
>ERR063464.1050 E201_0095:6:1:2937:1090#0/1
TCATATAAGCTCTTTGAAATAATCATGTGGTCTTCATCTCAAGGAAAGACTTGCTCTTTTGCACTGAGGATTCTCTTAATGCCAATTCTGACTGCCATGA
>ERR063464.1050 E201_0095:6:1:2937:1090#0/2
TTTGTTTCCATCATCATATAGTGAAGTCCAACAAACCTGGATATCCTAGTTTGCATCTTTAATTGCTTAACCGCTTATACTTGACTTGATATTAAGGACA
>ERR063464.1215 E201_0095:6:1:4179:1099#0/1
TCAGGTCCTTTGAAATTTTAAGCAAAACAATTTTGTGGGCATTATAACATAGCCAACATGGTCGAACAGATCTTGGCTCAGAATGGCCTCAATGTGGGAC
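
Dataset 11 holds 300,000 sequences drawn from the 946,848 interlaced sequences of dataset 9. The preview shows both mates of ERR063464.1050 next to each other, which suggests that whole pairs are sampled (150,000 pairs out of 473,424). The Python sketch below works under that assumption; the function name `sample_pairs` and the `seed` parameter are illustrative, and the sampling method of the Galaxy tool is not stated in this history.

```python
import random


def sample_pairs(interlaced, n_sequences, seed=None):
    """Randomly pick whole pairs from an interlaced list until n_sequences are kept."""
    if n_sequences % 2 or len(interlaced) % 2:
        raise ValueError("interlaced input and sample size must both be even")
    n_pairs = len(interlaced) // 2
    rng = random.Random(seed)
    # Sample pair indices without replacement, then restore the input order.
    chosen = sorted(rng.sample(range(n_pairs), n_sequences // 2))
    sample = []
    for p in chosen:
        sample.extend(interlaced[2 * p: 2 * p + 2])
    return sample
```

Sampling pair indices rather than individual sequences keeps each /1 read next to its /2 mate in the output.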
12: Archive with clustering results from dataset 11
1,536,898 lines
format: zip, database: ?
PK
13: Contigs from dataset 11 based on clustering
20,786 sequences
format: fasta, database: ?
>CL1Contig1 (101-2.0-197)
GCAATAAACAGCAAATGTTAATCAGGCAGACCATACAAACAACACAAGCACATAAATGCA
AGCTACAAGCTCAAGCTCAAGCTTCACAACCTACAAAACAA
>CL1Contig2 (100-1.7-172)
TTTAATCACACAGACAGACAAATCCAATCAAGCAACAATGAGCAAGGCACAAGCACAAGG
GAATTTGTTCTCAAAACCTACAAAACAACACAATGTTAGC
14: Log information of clustering based on dataset 11
14,250 lines
format: txt, database: ?
pipeline version:
d5204b9a1621 163 stable
This is clustering pipeline
GRAPH BASED CLUSTERING
**********************************************************************
15: HTML summary of graph based clustering on dataset 11
189.6 KB
format: html, database: ?
HTML file