Data Download
Feature Description
Download database sequences, statistics, and BLAST database files in various formats.
Download protein sequences in FASTA format. You can download the complete database or filter by functional system and/or species.
Complete Database
Download all sequences from the current database version (v1.2).
Format: FASTA (.faa)
Contains: All 728 sequences with headers and annotations
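After downloading, a quick sanity check is to count the records in the file and compare the total with the figure above. The following is a minimal sketch, not part of the database tooling: the function name read_fasta is illustrative, and it assumes only the standard FASTA layout (a header line starting with ">" followed by one or more sequence lines).

```python
from typing import Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs.

    The header is returned without the leading '>'. Sequence lines that
    belong to one record are joined into a single string.
    """
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Flush the final record.
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

Passing an open file handle for the downloaded .faa file to read_fasta and taking len() of the result should give 728 for v1.2 if the download is complete.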
Filtered Download
Download sequences filtered by functional system and/or species.
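The same kind of filtering can be approximated locally on the complete download. The sketch below is hypothetical: it assumes the species name and functional system appear as plain text somewhere in each FASTA header, which this page does not confirm, and the names filter_records, species, and system are illustrative.

```python
from typing import Iterable, Iterator, Optional, Tuple

Record = Tuple[str, str]  # (header, sequence)


def filter_records(
    records: Iterable[Record],
    species: Optional[str] = None,
    system: Optional[str] = None,
) -> Iterator[Record]:
    """Yield records whose header mentions the given species and/or system.

    Matching is a case-insensitive substring test on the header text
    (an assumption about the header layout). A filter left as None is
    not applied.
    """
    for header, sequence in records:
        text = header.lower()
        if species and species.lower() not in text:
            continue
        if system and system.lower() not in text:
            continue
        yield header, sequence
```

If both filters are given, a record must match both to be kept; with no filters, every record passes through unchanged.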
Version History
v1.2 (Current)
728 sequences, with aligned sequences and enhanced annotations
v1.1
907 sequences, 11 functional systems
Available in data/v1.1/ directory
Note: This version contains duplicate sequences
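Anyone working with the v1.1 files may want to locate the duplicates before analysis. The sketch below assumes that "duplicate" means identical residue strings (compared case-insensitively); the page does not say whether duplicates share IDs, sequences, or both. The function names are illustrative, and the input is a list of (header, sequence) pairs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def find_duplicates(records: Iterable[Record]) -> Dict[str, List[str]]:
    """Map each sequence that occurs more than once to its headers."""
    by_sequence: Dict[str, List[str]] = defaultdict(list)
    for header, sequence in records:
        by_sequence[sequence.upper()].append(header)
    return {seq: hdrs for seq, hdrs in by_sequence.items() if len(hdrs) > 1}


def deduplicate(records: Iterable[Record]) -> List[Record]:
    """Keep the first record for each distinct sequence, preserving order."""
    seen = set()
    unique: List[Record] = []
    for header, sequence in records:
        key = sequence.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, sequence))
    return unique
```

Keeping the first occurrence is an arbitrary choice; if the annotations differ between copies, inspecting the output of find_duplicates first is the safer route.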
v1.0
509 sequences, 16 genes
Available in data/v1.0/ directory
Download database statistics and analysis results in CSV format.
Database Statistics
Comprehensive statistics including sequence counts, length distributions, and functional system breakdowns.
Function Counts
Counts of sequences per functional system.
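Per-system counts can also be recomputed from any per-sequence CSV table, for example to cross-check the downloaded file. This is a sketch under stated assumptions: the column name passed in (such as functional_system) is hypothetical, since the actual CSV headers are not listed on this page.

```python
import csv
from collections import Counter
from typing import Iterable


def count_by_column(lines: Iterable[str], column: str) -> Counter:
    """Count rows per distinct value of `column` in CSV text.

    `lines` can be an open file handle or any iterable of CSV lines;
    the first line must be the header row.
    """
    reader = csv.DictReader(lines)
    return Counter(row[column] for row in reader)
```

The result is a Counter, so calling most_common() on it lists the functional systems from largest to smallest.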
Download the comprehensive gene function reference table in CSV or Excel format.
CSV Format
Comma-separated values format, compatible with most spreadsheet applications.
Excel Format
Microsoft Excel format (.xlsx) with formatting preserved.
Download the pre-built BLAST database for local sequence similarity searches.
Complete BLAST Database
The BLAST database is provided as a gzip-compressed tar archive (.tar.gz) containing all files needed to run BLAST searches.
Usage Instructions
- Download and extract the tar.gz archive
- Use the database with BLAST+ tools:
  blastp -query your_sequences.faa -db path/to/pH_resistance_db_final -out results.txt
- For more information, see the Database Information page
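The usage steps above can also be scripted. The following is a minimal sketch that extracts the archive with Python's standard tarfile module and then calls blastp with the same three options shown in the command. It assumes BLAST+ is installed and on the PATH; the archive file name and destination directory are up to the caller, and the helper names are illustrative.

```python
import subprocess
import tarfile
from pathlib import Path
from typing import List


def extract_archive(archive: str, dest: str) -> Path:
    """Unpack a .tar.gz archive into `dest` and return that directory."""
    target = Path(dest)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)
    return target


def build_blastp_command(query: str, db_prefix: str, out: str) -> List[str]:
    """Build the blastp argument list used in the usage instructions.

    `db_prefix` is the database path without a file extension,
    e.g. path/to/pH_resistance_db_final.
    """
    return ["blastp", "-query", query, "-db", db_prefix, "-out", out]


def run_blastp(query: str, db_prefix: str, out: str) -> None:
    """Run blastp and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_blastp_command(query, db_prefix, out), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.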
Supported Formats
- FASTA (.faa, .fa, .fasta): Standard protein sequence format with headers
- CSV (.csv): Comma-separated values for spreadsheet applications
- Excel (.xlsx): Microsoft Excel format with formatting
- BLAST Database: Pre-built BLAST+ database (tar.gz archive)
Data Fields
Downloaded sequences include the following metadata:
- Sequence ID
- Gene name
- Species
- Functional system classification
- pH range
- Environment
- Description