Information report for 889097
Gene Details
Functional Annotation
- RefSeq: XP_020869601.1 — uncharacterized protein LOC9326263, partial
- Swiss-Prot: B5X561 — SHA_ARATH; SH2 domain-containing protein A
- TrEMBL: D7KEL3 — D7KEL3_ARALL; Uncharacterized protein
- STRING: scaffold_101891.1 — (Arabidopsis lyrata)
- GO:0008299 — Biological Process — isoprenoid biosynthetic process
- GO:0015979 — Biological Process — photosynthesis
- GO:0016765 — Molecular Function — transferase activity, transferring alkyl or aryl (other than methyl) groups
Family Introduction
- The CudA and ECudA proteins share a 120 amino acid core of homology, and clustered point mutations introduced into two highly conserved motifs within the ECudA core region decrease its specific DNA binding in vitro. This region, the presumptive DNA-binding domain, is similar in sequence to domains in two Arabidopsis proteins and one Oryza protein. Significantly, these are the only proteins in the two plant species that contain an SH2 domain. Such a structure, with a DNA-binding domain located upstream of an SH2 domain, suggests that the plant proteins are orthologous to metazoan STATs. Consistent with this notion, the DNA sequence of the CudA half site, GAA, is identical to metazoan STAT half sites, although the relative positions of the two halves of the dyad are reversed. These results define a hitherto unrecognised class of transcription factors and suggest a model for the evolution of STATs and their DNA-binding sites.
Gene Resources
- SuperFamily: SSF55550
- Gene3D: G3DSA:3.30.505.10
- PROSITE profile: PS50001
- InterPro: IPR000980
Sequences
CDS Sequence:
- >889097|Arabidopsis_lyrata|STAT|889097
ATGAGCGTTAGTGCACCTTTCCTTGTTCTAGATGAAAACAAGAAGATGATGCTTTTGCCATTGACTCTCCTTCACAATGAAGCTCCTGATCCTGTTAACATTTCTTCTTGGACTGAAGTTCCTAATGTTTCTACAACTGCTGAGTTTCCTCTTCAAAAATGGGTTCATGTGGGTTGCGAGGTTTCTAGAAACTACATGCGCCTTTATATTTGTGGAGAGATTGTAGGAGAGCAAGTCTTAACTTCCTTGATGACCAATAGTACAAATTCAGATTGCGCACGAAAGATATCGTTATTCAGTGTTGGTGGAGATGGTTATAGTGTTCAGGGTTTCATCCACTGTGCTGAAGTTTTGCCTTCTAATGTCCCGGCAAATTATCACTACACAAAGGACCCACCTTTATGGTTATCTGTTGATAAGCCATCTACATCTGGAATTGGATTAGATAAAGATGGTGTTTGGATTATCGTTAGCGGAACATTTTGTTCCTTGGATGTTGTTTTAACCAATGCTATTGGACAGCCTGTGCACAAGGATGTGAAGGTTGTGGCTTCTCTACTGTATGCTGATAGTGGGATGCCAGTTGAGAAGATGAGTGACTCGGAGGCTCCCCTTTTGGTAAGCTATGAAGGAGTTGAATTCTCTGCTGAAGATAAGCCATGCAACTTATTGAACGGATGTGCTTCTTTCAAGCTCAAATTATCTCAGCTTTCTTCCAAGAGTGATAAGAGATTGTTCTGTGTCAAATTCGAAATACCAGAAGTGAAGGCCTATTATCCTTTCCTTGAAACTGTTACCAACCAAATCCGTTGCATTTCGAGGAACCATGATTCTCTTACCCCTAAAAGGTCGAATCGCATAGATTATCCATTAGATGGAGGAGAACCAGAACTTGCTTCAAGAAGTAATGGAACATCCGACATTCTACATAGCTCTTCCTCTATGAAACGGATCAGATTAGGGGAAGAAAAGGTTTCTGAGAGTGAGACTGAGAATGGAAATGGTACGAGTATGGAATGGAGACCTCAGAACCATGAAGAAGAAGACGAAGAAGATAACTCTTCAACTGATTCCGAAAACACTGAAATTAGAGACTCGACTGCTTTCAGGAGATATACAATCTCAGACTCGATTATTTTCAAATACTGCCTTGGAAACTTAACAGAGAGAGCTCTTCTTCTGAAGGAAATCACAAATAATTCATCAGATGAAGAAGTCTCGGAATTTGTAGATCAAGTTTCTCTCTATTCCGGATGCTTCCACCACAGCTATCAAATCAAAATGGCAAGACAATTGATAGCAGAAGGAACAAATGCGTGGAATCTGATATCTCGGAACTATCAACATGTTCATTGGGACAATGTGGTAATTGAGATTGAAGAACATTTCATGAGAATAGCTAAATGCAGCAGTAGATCTCTCACTCACCAGGATTTTGACCTTCTAAGAAGAATATGTGGATGCTATGAATACATAACTCAAGAGAATTTTGAGAAAATGTGGTGTTGGTTGTTCCCTGTTGCTTCGGCTATATCCAGGGGATTGATTAACGGAATGTGGCGCTCAGCTTCGCCTAAATGGATAGAAGGGTTTGTGACTAAAGAAGAGGCAGAACGTTCGCTTCAGAATCAAGTAGCGGGAACTTTCATTCTCAGGTTCCCTACTTCAAGAAGCTGGCCACATCCTGATATTGTTGGTGCAGAAAATCCGGTTTTGATATCTGCGGCTGAGCAAATCTTCAGTGCTGGTGGCAAGAGGATGAGACCGGGTTTGGTATTCCTTGTATCACGAGCCACTGCGGAATTAGCTGGCTTAAAGGAACTTACAGTAGAACATCGGCGTTTAGGTGAGATCATTGAGATGATTCACACCGCAAGTTTGATACACGATGATGTGTTAGACGAAAGTGATATGCGAAGAGGAAAAGAAACGGTTCACGAGCTTTTCGGAACAAGAGTAGCTGTATTAGCTGGAGATTTCATGTTTGCTCAAGCATCATG
GTACTTAGCAAATCTCGAAAACCTCCAAGTCATTAAGCTCATAAGTCAGGTAATCAAAGATTTTGCAAGCGGTGAGATAAAGCAAGCATCAAGTTTATTCGATTGTGATGTCGAGCTTGATGACTACTTGCTAAAGAGTTACTACAAGACAGCTTCATTAGTAGCTGCAAGCACCAAAGGAGCTGCAATTTTCAGTAAAGTCGAAAGCGAGGTCGCAGAGCAAATGTATCAGTTCGGAAAGAATCTCGGTTTATCTTTTCAAGTAGTTGATGACATTCTGGACTTCACTCAATCCACAGAGCAGTTAGGGAAACCTGCAGCTAATGACTTAGCCAAAGGTAACATAACAGCGCCAGTGATCTTCGCACTAGAGAATGAGCCGAGGCTAAGAGAGATCATTGAGTCTGAGTTTTGTGAGCCTGGATCGCTTGAAGAAGCGATTGAAATAGTTAGAAATCGCGGTGGGATCAAGAAAGCTCAAGAATTGGCTAAGGAGAAAGGTGAGCTTGCGTTAAAGAATCTGAATTGTCTTCCGAGAAGTGGTTTCAGATCGGCTCTTGAGGATATGGTGATGTTTAATCTTGAAAGGATTGATTAG
Protein Sequence:
- >889097|Arabidopsis_lyrata|STAT|889097
MSVSAPFLVLDENKKMMLLPLTLLHNEAPDPVNISSWTEVPNVSTTAEFPLQKWVHVGCEVSRNYMRLYICGEIVGEQVLTSLMTNSTNSDCARKISLFSVGGDGYSVQGFIHCAEVLPSNVPANYHYTKDPPLWLSVDKPSTSGIGLDKDGVWIIVSGTFCSLDVVLTNAIGQPVHKDVKVVASLLYADSGMPVEKMSDSEAPLLVSYEGVEFSAEDKPCNLLNGCASFKLKLSQLSSKSDKRLFCVKFEIPEVKAYYPFLETVTNQIRCISRNHDSLTPKRSNRIDYPLDGGEPELASRSNGTSDILHSSSSMKRIRLGEEKVSESETENGNGTSMEWRPQNHEEEDEEDNSSTDSENTEIRDSTAFRRYTISDSIIFKYCLGNLTERALLLKEITNNSSDEEVSEFVDQVSLYSGCFHHSYQIKMARQLIAEGTNAWNLISRNYQHVHWDNVVIEIEEHFMRIAKCSSRSLTHQDFDLLRRICGCYEYITQENFEKMWCWLFPVASAISRGLINGMWRSASPKWIEGFVTKEEAERSLQNQVAGTFILRFPTSRSWPHPDIVGAENPVLISAAEQIFSAGGKRMRPGLVFLVSRATAELAGLKELTVEHRRLGEIIEMIHTASLIHDDVLDESDMRRGKETVHELFGTRVAVLAGDFMFAQASWYLANLENLQVIKLISQVIKDFASGEIKQASSLFDCDVELDDYLLKSYYKTASLVAASTKGAAIFSKVESEVAEQMYQFGKNLGLSFQVVDDILDFTQSTEQLGKPAANDLAKGNITAPVIFALENEPRLREIIESEFCEPGSLEEAIEIVRNRGGIKKAQELAKEKGELALKNLNCLPRSGFRSALEDMVMFNLERID*