Published Histories | jirka | Example history #1

Galaxy History 'Example history #1'

Annotation: Clustering analysis of a small sample dataset of 454 reads from the rye genome, followed by identification and phylogenetic analysis of retrotransposon RT domains in the assembled contigs.

Dataset | Annotation
1: rye 4B, randomly sampled 200,000 reads (FASTA)
200,000 sequences
format: fasta, database: ?
>1
ACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACGACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACGACACTACACACACACACACACACACACACTACACACACACACACACACACACACACACACACACACACACACACACACACACACTACTACTACGTACGTACG
TACTACGTACGTACTAAC
>2
CACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACA
CACACACACACACACACACACGACGACACACACGACGACGACGACGACGACG
Sample dataset prepared from rye (Secale cereale) 454 sequencing run EBI SRA ERR058831. Reads were quality-filtered, converted from FASTQ to FASTA format and randomly sampled to obtain 200,000 sequences.
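The preparation steps described above (FASTQ to FASTA conversion followed by random sampling) can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used for this dataset: the function names `fastq_to_fasta` and `sample_reads` are hypothetical, the FASTQ input is assumed to use plain four-line records, and the quality-filtering step is omitted because its criteria are not stated here.

```python
import random


def fastq_to_fasta(lines):
    """Yield (name, sequence) pairs from four-line FASTQ records.

    Assumption: each record is exactly four lines (header, sequence,
    '+' separator, qualities); wrapped FASTQ is not handled.
    """
    lines = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(lines) - 3, 4):
        name, seq = lines[i], lines[i + 1]
        if not name.startswith("@"):
            raise ValueError(f"expected '@' header, got: {name!r}")
        yield name[1:], seq


def sample_reads(records, n, seed=0):
    """Return up to n records drawn uniformly at random without replacement."""
    records = list(records)
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return rng.sample(records, min(n, len(records)))


if __name__ == "__main__":
    fastq = ["@r1", "ACGT", "+", "IIII", "@r2", "GGCC", "+", "IIII"]
    for name, seq in sample_reads(fastq_to_fasta(fastq), 1):
        print(f">{name}\n{seq}")
```

For the dataset described here, `n` would be 200,000; holding all records in memory is a simplification that may not suit a full sequencing run.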
2: Archive with clustering results from rye 4B, randomly sampled 200,000 reads (FASTA)
1,945,579 lines
format: zip, database: ?
Info:
binary/unknown file
A zip archive containing multiple folders and files that represent the main output of the analysis. It must be downloaded and unzipped to view its contents.
3: Contigs from rye 4B, randomly sampled 200,000 reads (FASTA) based on clustering
9,729 sequences
format: fasta, database: ?
Info:
>CL1Contig1 (815-1.9-1535)
TTCCCCGACATCTTTGGCGCGCCAGGTAGGGGGGTGCGTCGAGATTGTGTGAACCCGATC
CGGCGTCCACACGAGCCAGATCTTCATCGTCTTCATCAACATGCCGCCGAAGAAGAAGGC
TTTGGAGGTAGCTGGACCGTCCGCGTCGATCCCACCACCGCCAGAGCCAACGGCTGGTGG
GGTAGACGCCGGAAGAAGAACGGACATCGACGAGGGAACTCACGGTGCTGCCAGGTCCAA
GGGCAAGGCTAGCGTAGCCGGCTCCCACGACGACCGGGCACACTCCAAGACCTCGGCATC
Contig sequences obtained by assembling reads within individual clusters.
4: Log file from rye 4B, randomly sampled 200,000 reads (FASTA)
6,878 lines
format: txt, database: ?
Info:
True
This is clustering pipeline
GRAPH BASED CLUSTERING
**********************************************************************
Data preparation started:
5: HTML summary of graph based clustering of rye 4B, randomly sampled 200,000 reads (FASTA)
76.1 KB
format: html, database: ?
Info:
HTML file
Summary of various information for the top clusters.
6: output fasta and domain RT
219,544 lines
format: fasty36, database: ?
fasty36 -m 3 -m9 -b 1 -H -q -O output.part.0 - /home/galaxy/galaxy-dist/tool-data/domains/TE_domains_newest_RT
FASTY compares a DNA sequence to a protein sequence data bank version 36.06 Sep, 2010
Please cite:
Pearson et al, Genomics (1997) 46:24-36
Query: CL1Contig1, 815 aa
Raw output of the fasty36 search performed on the assembled contigs using the "Protein domain search" tool. The contigs were searched for similarity to a reference database of retrotransposon reverse transcriptase (RT) domains.
7: fasta output Ty1 (from-output fasta and domain RT, with sensitive: c-60 d-40 s-0.8 r-3)
45 sequences
format: fasta, database: ?
Info: keys: Ty1-RT__Wicker_Osr1_Os_cons

Ty1-RT__Wicker_Barbara_A_210J24-3

Ty1-RT__Wicker_Inga_AY661558-2

Ty1-RT__Wicker_Angela_A_cons

Ty1-RT__Wicker_Inga_AY661558-2

Ty1-RT__Wicker_BARE1_B_cons

Ty1-RT__Wicker_Angela_A_cons

Ty1-RT__Wic
>CL27Contig4_592-1236
WIQAMQEELQQFELNNVWELVKRPDPRKHNIIGTKWIYRNKQDEHGQVVR
NKARLVAQGYTQVEGIDFDETFAPVARLEAIRILLAYANHHNIFLYQMDV
KSAFLNGKIEEEVYVAQPPGFEDPKRPDMVYKLNKALYGLKQAPRAWYDT
LKDFLKSKGFKPGSLDPTLFTKTYDGELFVCQIYVDDIIFGCTVKRYSYE
FGYMMQVQYQMSMMG
Sequences of identified Ty1/copia RT domains selected from the raw fasty36 output (see filtration parameters). Sequences from the reference database with the best hits to contig sequences are also provided. This file can be downloaded and used for further analysis on your local computer, or used as input to the "Create tree" tool, which performs multiple sequence alignment and tree construction.
8: fasta output Ty3 (from-output fasta and domain RT, with: c-60 d-40 s-0.8 r-3)
106 sequences
format: fasta, database: ?
Info: keys: Ty1-RT__Wicker_Osr1_Os_cons

Ty1-RT__Wicker_Barbara_A_210J24-3

Ty1-RT__Wicker_Inga_AY661558-2

Ty1-RT__Wicker_Angela_A_cons

Ty1-RT__Wicker_Inga_AY661558-2

Ty1-RT__Wicker_BARE1_B_cons

Ty1-RT__Wicker_Angela_A_cons

Ty1-RT__Wic
>CL1Contig12_3035-3556
VRGVIHPTWLANPVVVRKANGKWRLCIDFTDVNKACPKDPFPLPRIDQIV
DSTAGCELLSFLDAYSGYHQIFMAKEDEEKTAFITPCGTYCFIRMPFGLK
NAGSTFARVVYTAFEPQIHRNVEAYMDDIVVKSKSKETLIQDLEETFQNL
RRIQLKLNPEKCVFGVPSGKLLGF
>CL1Contig13_3060-3581
The same as item 7, but performed for Ty3/gypsy RT domains. This file was used as input to the "Create tree" tool, generating history items 9-12.
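Sequence names in items 7 and 8, such as CL27Contig4_592-1236, appear to combine a cluster number, a contig number, and the start and end coordinates of the RT domain on the contig. A minimal sketch of splitting such names is shown below. The names `DomainHit` and `parse_domain_id` are hypothetical, and treating a start coordinate larger than the end coordinate as a reverse-strand hit is an assumption based on names such as CL1Contig27_584-61, not something the tool documents here.

```python
import re
from typing import NamedTuple, Optional


class DomainHit(NamedTuple):
    cluster: int
    contig: int
    start: int
    end: int
    strand: str  # "f" or "r"; inferred, see assumption below


# Pattern for names like "CL27Contig4_592-1236" (an optional ">" is stripped first).
_DOMAIN_ID = re.compile(r"^CL(\d+)Contig(\d+)_(\d+)-(\d+)$")


def parse_domain_id(name: str) -> Optional[DomainHit]:
    """Split a contig-derived domain name into its parts.

    Returns None for names that do not follow the pattern, for example
    reference sequences such as "Ty3-RT__Reina_ZeaM_ID49".
    """
    match = _DOMAIN_ID.match(name.lstrip(">").strip())
    if match is None:
        return None
    cluster, contig, start, end = map(int, match.groups())
    # Assumption: reversed coordinates mean the hit lies on the reverse strand.
    strand = "f" if start <= end else "r"
    return DomainHit(cluster, contig, start, end, strand)


if __name__ == "__main__":
    for name in (">CL1Contig12_3035-3556", "CL1Contig27_584-61",
                 "Ty3-RT__Reina_ZeaM_ID49"):
        print(name, parse_domain_id(name))
```

Returning `None` for non-matching names makes it easy to separate contig-derived domains from the reference sequences included in the same FASTA file.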
9: tabular output (from-output fasta and domain RT, with c-60 d-40 s-0.8 r-3)
131 lines
format: tabular, database: ?
Info: keys: Ty1-RT__Wicker_Osr1_Os_cons

Ty1-RT__Wicker_Barbara_A_210J24-3

Ty1-RT__Wicker_Inga_AY661558-2

Ty1-RT__Wicker_Angela_A_cons

Ty1-RT__Wicker_Inga_AY661558-2

Ty1-RT__Wicker_BARE1_B_cons

Ty1-RT__Wicker_Angela_A_cons

Ty1-RT__Wicker_WIS_B_cons
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Type lineage query_name query_length hit_name hit_length strand opts bits e_val perc_id perc_sim sw alen an0 ax0 pn0 px0 an1 ax1 pn1 px1 gapq gapl fs
RIRE2_I Ty3/gypsy Ogre/Tat CL1Contig12 4659 Ty3-RT__RIRE2_I 174 f 878 243.6 2.2e-65 0.678 0.874 878 174 3035 3556 1 4659 1 174 1 174 0 0 0
RIRE2_I.1 Ty3/gypsy Ogre/Tat CL1Contig13 4492 Ty3-RT__RIRE2_I 174 f 894 248 1.1e-66 0.69 0.885 894 174 3060 3581 1 4492 1 174 1 174 0 0 0
RIRE2_I.2 Ty3/gypsy Ogre/Tat CL1Contig23 5275 Ty3-RT__RIRE2_I 174 f 878 243.7 2.5e-65 0.678 0.874 878 174 3301 3822 1 5275 1 174 1 174 0 0 0
RIRE2_I.3 Ty3/gypsy Ogre/Tat CL1Contig26 3906 Ty3-RT__RIRE2_I 174 f 897 248.8 5.3e-67 0.695 0.885 897 174 2891 3412 1 3906 1 174 1 174 0 0 0
RIRE2_I.4 Ty3/gypsy Ogre/Tat CL1Contig27 663 Ty3-RT__RIRE2_I 174 r 840 233.3 4.1e-63 0.678 0.879 840 174 584 61 663 1 1 174 1 174 0 0 2
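A minimal sketch of reading this tabular output is given below. It assumes the downloaded file is tab-separated, that the header line carries the 25 column names shown above, and that each data row has one extra leading field (an unnamed row label). The name `row_id` for that field and the function `read_hits` are hypothetical. The sample row is one reading of the first row of the preview.

```python
import csv
import io


def read_hits(handle):
    """Read the tabular hit list into a list of dicts keyed by column name.

    Assumption: data rows may carry one more field than the header (an
    unnamed leading row label); it is stored under the key 'row_id'.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    hits = []
    for fields in reader:
        if not fields:
            continue
        names = header if len(fields) == len(header) else ["row_id"] + header
        hits.append(dict(zip(names, fields)))
    return hits


HEADER = ("Type lineage query_name query_length hit_name hit_length strand "
          "opts bits e_val perc_id perc_sim sw alen an0 ax0 pn0 px0 an1 ax1 "
          "pn1 px1 gapq gapl fs").split()
ROW = ("RIRE2_I Ty3/gypsy Ogre/Tat CL1Contig12 4659 Ty3-RT__RIRE2_I 174 f "
       "878 243.6 2.2e-65 0.678 0.874 878 174 3035 3556 1 4659 1 174 1 174 "
       "0 0 0").split()

if __name__ == "__main__":
    text = "\t".join(HEADER) + "\n" + "\t".join(ROW) + "\n"
    for hit in read_hits(io.StringIO(text)):
        # Rebuild the FASTA-style name from the query name and alignment bounds.
        domain_id = f"{hit['query_name']}_{hit['an0']}-{hit['ax0']}"
        print(domain_id, hit["lineage"], float(hit["perc_id"]))
```

In this reading, the `an0` and `ax0` columns hold the alignment start and end on the contig, which is consistent with the sequence name CL1Contig12_3035-3556 in item 8.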
10: Alignment output (from-fasta output Ty3 (from-output fasta and domain RT, with: c-60 d-40 s-0.8 r-3), with distance method-0)
678 lines
format: alignment, database: ?
PileUp
MSF: 198 Type: A Check: 0000 ..
Name: CL3472Contig1_495-48 Len: 198 Check: 9509 Weight: 0.00257114
Name: Ty3-RT__Reina_ZeaM_ID49 Len: 198 Check: 2348 Weight: 0.00219491
Multiple sequence alignment used to calculate the tree. The alignment was generated using the MUSCLE program and saved in MSF format.
11: Newick output (from-fasta output Ty3 (from-output fasta and domain RT, with: c-60 d-40 s-0.8 r-3), with distance method-0)
1 line
format: newick, database: ?
((((((CL22Contig45_1536-2057:1.582044752,CL22Contig29_1387-1908:0.137955248):2.053871496,CL22Contig42_1380-1901:0.2461285037):0.4927109697,CL22Contig59_300-821:0.6572890303):0.2022996261,CL22Contig28_1370-1874:5.607700374):0.5295751272,(CL22Contig4_3387-2908:2.948046482,CL22Contig52_224-740:5.181953518):1.475737373):1.369706055,((((((CL3886Contig1_248-691:9.167816577,CL22Contig15_104-556:7.732183423):1.170535244,((((((((CL145Contig3_508-1030:4.099745577,CL145Contig5_167-688:1.070254423):1.59827643,CL1086Contig1_216-737:4.43672357):12.22750581,(CL86Contig2_3068-2547:0.8382326434,CL86Contig3_981-1421:4.601767357):15.95374419):1.398933321,(((((((CL483Contig1_742-221:2.192095571,CL48Contig11_3010-3531:1.827904429):1.034921588,CL942Contig1_549-1023:8.665078412):0.9469016104,CL48Contig21_785-272:7.99059839):3.095032985,CL1368Contig1_1-441:9.499967015):0.2992858347,(((((CL1Contig26_2891-3412:0,CL1Contig33_2030-2551:0):0.254273306,CL1Contig13_3060-3581:0.315726694):0.7217059939,(CL1Contig23_3301-3822:0.3119351248,CL1Contig27_584-61:0.2580648752):1.005794006):0.773911109,CL1Contig38_1592-2113:3.824838891):0.1900938936,(((CL1Contig39_1020-514:0.7109930526,CL1Contig51_2075-1554:-0.1209930526):0.3061135159,CL1Contig31_2494-3015:0.2738864841):0.9670053843,(CL1Contig12_3035-3556:0.03863205501,CL1Contig30_1298-1819:0.531367945):1.062994616):0.8234998564):4.279112603):7.870060161,((((CL2331Contig1_3-525:7.324228841,CL999Contig1_343-862:5.395771159):1.270599902,CL1828Contig1_767-246:9.084400098):1.782835962,CL142Contig3_1078-16
Resulting tree in Newick format.
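A minimal sketch of pulling leaf names and their branch lengths out of a Newick string like the one above, using only a regular expression. The function name `leaf_branch_lengths` is hypothetical, and the approach assumes unquoted leaf names, which holds for the names in this preview; a full Newick parser would be needed for quoted labels or comments. The sample string is a balanced subtree copied from the preview.

```python
import re

# A leaf is a label that directly follows "(" or "," and is followed by
# ":<length>". Internal-node lengths follow ")" and are therefore skipped.
_LEAF = re.compile(
    r"(?<=[(,])([^(),:;]+):(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def leaf_branch_lengths(newick):
    """Return a list of (leaf_name, branch_length) pairs in file order."""
    return [(name, float(length)) for name, length in _LEAF.findall(newick)]


if __name__ == "__main__":
    subtree = ("((CL1Contig26_2891-3412:0,CL1Contig33_2030-2551:0):0.254273306,"
               "CL1Contig13_3060-3581:0.315726694);")
    for name, length in leaf_branch_lengths(subtree):
        print(f"{name}\t{length}")
```

The pattern accepts negative lengths, which do occur in this tree (for example CL1Contig51_2075-1554:-0.1209930526) and are a known side effect of neighbor joining.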
12: Neighbor joining tree (from-fasta output Ty3 (from-output fasta and domain RT, with: c-60 d-40 s-0.8 r-3), with distance method-0)
60.2 KB
format: html, database: ?
HTML file
A summary including an image of the tree, a list of reference sequences, and the multiple alignment.