Information report for Thecc1EG034172t1
Gene Details
Functional Annotation
- Refseq: XP_007017723.1 — PREDICTED: uncharacterized protein LOC18591502
- TrEMBL: A0A061FDF4 — A0A061FDF4_THECC; Basic-leucine zipper transcription factor family protein, putative isoform 1
- STRING: EOY14948 — (Theobroma cacao)
- GO:0006355 — Biological Process — regulation of transcription, DNA-templated
- GO:0003700 — Molecular Function — transcription factor activity, sequence-specific DNA binding
- GO:0043565 — Molecular Function — sequence-specific DNA binding
Family Introduction
- The bZIP domain consists of two structural features located on a contiguous alpha-helix: first, a basic region of ~16 amino acid residues containing a nuclear localization signal followed by an invariant N-x7-R/K motif that contacts the DNA; and, second, a heptad repeat of leucines or other bulky hydrophobic amino acids positioned exactly nine amino acids towards the C-terminus, creating an amphipathic helix. To bind DNA, two subunits associate through interactions between the hydrophobic sides of their helices, forming a superimposed coiled-coil structure. The ability to form homo- and heterodimers is influenced by the electrostatic attraction and repulsion of polar residues flanking the hydrophobic interaction surface of the helices.
- Plant bZIP proteins preferentially bind DNA sequences with an ACGT core, and the flanking nucleotides determine binding specificity. The preferred sites are the A-box (TACGTA), C-box (GACGTC) and G-box (CACGTG), but nonpalindromic binding sites are also known.
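The ACGT-core sites named above can be located in a DNA string with a short scan. The following is a minimal Python sketch, not part of the database: the function name `find_acgt_sites` and the example fragment are hypothetical, and the three box labels are taken directly from the description above. Any other hexamer with a central ACGT is reported simply as "other ACGT core".

```python
import re

# Hexamer names taken from the family description above.
BOXES = {"TACGTA": "A-box", "GACGTC": "C-box", "CACGTG": "G-box"}


def find_acgt_sites(dna):
    """Return (position, hexamer, label) for every hexamer with a central ACGT core.

    Positions are 0-based offsets of the first base of the hexamer.
    """
    dna = dna.upper()
    hits = []
    # A lookahead is used so that overlapping hexamers are all reported.
    for match in re.finditer(r"(?=([ACGT]ACGT[ACGT]))", dna):
        hexamer = match.group(1)
        hits.append((match.start(), hexamer, BOXES.get(hexamer, "other ACGT core")))
    return hits


# Hypothetical fragment, made up for illustration only (not a real promoter).
example = "TTCACGTGAATACGTAGG"
for pos, hexamer, label in find_acgt_sites(example):
    print(pos, hexamer, label)
# prints:
# 2 CACGTG G-box
# 10 TACGTA A-box
```

Only one strand is scanned. That is enough for the three palindromic boxes, but a nonpalindromic site would also need the reverse complement checked.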
Literature and News
Gene Resources
Homologs
- Gossypium hirsutum: Gh_D02G0972, Gh_A03G2097
- Juglans regia: WALNUT_00012094-RA
- Nicotiana tabacum: XP_016480187.1, XP_016480186.1, XP_016480185.1, XP_016442558.1
- Prunus mume: XP_008221306.1
- Prunus persica: Prupe.1G369300.2.p, Prupe.1G369300.1.p
- Sesamum indicum: XP_011088546.1, XP_011088545.1
Sequences
CDS Sequence:
- >Thecc1EG034172t1|Theobroma_cacao|bZIP|Thecc1EG034172t1
ATGGAGGGTATTAGTGAGAGTAGAAGTAATATGCAGAAGCTTCAATCACCACCGTCGAATTCCATCCCTAAGCCACAAAGCAACTTAGATATACCCATTTTCAATGCCTCTCAAATGGCTCCTTCTCCTCATACGCGTTTAAGTCCTGAGAACAATAACAACAAAAGACCCGGGATACCTCCTTCACACCCCAATTACCCCGCTGCTACATCGCCTTACTCACAGATCATTGGTTCTCGTTCCAATTCTCAACAAGGGGCACCGTCTCATTCCAGGTCTCTATCTCAACCCACATTCTTTTCCCTTGATAGCTTGCCCCCATGGAGTCCTCCCCCTTATCGAGAGCCGTCTGTCGCGTCTCTGTCTGATCCTGCTTCCAATGATGTTTCTATGGAAGAAAGGGTAGTCAATTCTAATGTTAGGTCCTCACTTCCTTCACCTGTTGCTAGAGGAGTTAACGAGTTTCGTGTCGGCGAGAGTTCGAGTTTGCCTCCGCGTAAAGGACATAGGCGGTCTAGCAGTGATGTTCCACTAGGATTTTCTGCTATGATTCAGTCTTCGCCTCAATTGCTTCCCATAGGGAGTCGTGGCGTGTTGGATAGGTCAGTTTCGGGTAGGGAGAGTTCTTCTGGTGTGGAGAAACCGATTCAGCTGGTGAAACGAGAATCAGAATGGAGCAAGGATGGGAGTAGTAATGTAGAAGGGATGAGTGAAAGGAAATCCGAGGGGGATGTTGCTGATGACTTGTTCAATGCGTACATGAATTTGGACAGTCTTGAGACATTGAACTCTTCTGGAACTGAGGATAAGGATTTGGATAGCAGAGCAAGTGGCACAAAGACATATGGAGGTGAAAGTAGTGACAATGAAGTGGAGAGTAGAGTAAATGGACATCCAATTAGTATGCAGGGAATGAGTGCTGGTGCTTCAAATGAGAAGGGGGTCAAGAGGAGTGCTGGTGGAGATATTGCTCCCACTGCTCGGCATCATAGAAGTGTTTCAATGGATAGTTACATGGGAAGTCTGCAATTTGATGACGAATCATCAAAGATTCCTCCTGGTAGCTCAGTGGATGCAAATTCGGGCAAGTTCAACCTTGAGCTTGGGAGTAGTGAGTTCAGTGAAGCTGAGATGAAAAAGATCATGGAAAATGAGAAGCTTGCTGAGATTGCTTCTGTAGACCCAAAACGCGCCAAAAGGATTTTGGCTAATCGTCAATCAGCTGCTCGTTCAAAGGAGCGGAAGATGCGGTACATTGCAGAATTAGAACACAAGGTGCAAACTCTGCAAACAGAGGCAACCACATTATCTGCACAGCTAACAATGTTACAGAGAGACTCTGCTGGGCTTACTAGTCAGAACAATGAGTTGAAATTTCGTCTTCAGGCCATGGAGCAACAGGCCCAACTGAAAGATGCACTAAACGAAGCATTAGCTGCCGAAGTCCAGCGACTGAAGGTTACTGCAGCAGAGCTCAGTGGGGAGGCTCATCTTTCAAGCTGCATGGCTCAGCAGCTTTCGCTGAATCATCCGATGTTCCAATTACAGCCTCAGCAACCTCAGCAGGTGAATGTTTATCAGATGCAGCAACAGCAGCAACACCAACAGCCACAGCACAGTCAGCACAACCAGTTGCAGACCCAGCAGCAACAGAATGATGACCCTACTGCAAATGAATCTAAGTGA
Protein Sequence:
- >Thecc1EG034172t1|Theobroma_cacao|bZIP|Thecc1EG034172t1
MEGISESRSNMQKLQSPPSNSIPKPQSNLDIPIFNASQMAPSPHTRLSPENNNNKRPGIPPSHPNYPAATSPYSQIIGSRSNSQQGAPSHSRSLSQPTFFSLDSLPPWSPPPYREPSVASLSDPASNDVSMEERVVNSNVRSSLPSPVARGVNEFRVGESSSLPPRKGHRRSSSDVPLGFSAMIQSSPQLLPIGSRGVLDRSVSGRESSSGVEKPIQLVKRESEWSKDGSSNVEGMSERKSEGDVADDLFNAYMNLDSLETLNSSGTEDKDLDSRASGTKTYGGESSDNEVESRVNGHPISMQGMSAGASNEKGVKRSAGGDIAPTARHHRSVSMDSYMGSLQFDDESSKIPPGSSVDANSGKFNLELGSSEFSEAEMKKIMENEKLAEIASVDPKRAKRILANRQSAARSKERKMRYIAELEHKVQTLQTEATTLSAQLTMLQRDSAGLTSQNNELKFRLQAMEQQAQLKDALNEALAAEVQRLKVTAAELSGEAHLSSCMAQQLSLNHPMFQLQPQQPQQVNVYQMQQQQQHQQPQHSQHNQLQTQQQQNDDPTANESK*
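Two quick consistency checks can be run on the records above: translating the CDS with the standard genetic code to compare it with the listed protein, and locating the N-x7-R/K motif from the family introduction. The Python sketch below is an illustration only. The names `translate` and `find_basic_region` are hypothetical, and the pattern `N.{7}[RK].{9}L` assumes that "exactly nine amino acids towards the C-terminus" means nine residues between the R/K and the first leucine of the heptad repeat.

```python
import re

# Standard genetic code (NCBI translation table 1), built from the
# conventional T, C, A, G ordering of the three codon positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a CDS in frame 1; stop codons are written as '*'.

    Trailing bases that do not form a full codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Assumed reading of the family description: N, 7 residues, R or K,
# 9 residues, then the first leucine of the heptad repeat.
BZIP_MOTIF = re.compile(r"N.{7}[RK].{9}L")


def find_basic_region(protein):
    """Return (0-based start, matched text) of the first motif hit, or None."""
    match = BZIP_MOTIF.search(protein)
    return (match.start(), match.group()) if match else None


# First 27 bases of the CDS above and a fragment copied from the protein above.
print(translate("ATGGAGGGTATTAGTGAGAGTAGAAGT"))   # MEGISESRS
print(find_basic_region("DPKRAKRILANRQSAARSKERKMRYIAELEHKVQTLQ"))
# (10, 'NRQSAARSKERKMRYIAEL')
```

Passing the full CDS to `translate` and comparing the result with the full protein record extends the first check to the whole gene model.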