Scope of Bioinformatics
Greetings
It is a great honor for me to be writing in the great Steem4Bloggers community. As a newcomer, I hope everyone will support and guide me.
[source](https://pixabay.com/photos/computer-plate-page-electronics-5088593/)
Bioinformatics draws on molecular biology, biochemistry and genetics on the one hand, and on theoretical and practical computer science and computational linguistics on the other. It has a homogeneous and broad stock of open problems. It is becoming increasingly important in biology and genetics and is already used in industry.
Goals of bioinformatics
Proteins are complex molecules that are the building blocks of all life forms. The variety of proteins is enormous: by some estimates, there are over a million different proteins in the human body. Proteins are composed of amino acids in a manner that is encoded in DNA (deoxyribonucleic acid). DNA is a linear polymer made up of 4 kinds of nucleotides. A sequence of 3 consecutive nucleotides is called a codon. Of the 4³ = 64 possible codons, 61 encode the 20 standard amino acids, each amino acid by 1 to 6 codons and most by more than one; the remaining 3 codons signal the end of a protein. A goal of sequence analysis is to find the regions of DNA that encode a protein. This is very complex: there are many non-coding regions; the boundaries between coding and non-coding regions are difficult to discern; and the coding regions are not necessarily contiguous. Computer methods are currently being combined with biochemical laboratory tests for sequence analysis.
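To make the codon idea a little more concrete, here is a minimal Python sketch. It builds the standard genetic code table, translates a DNA string, and looks for simple start-to-stop stretches. The function names `translate` and `find_orfs` are my own, and the sketch assumes the input contains only the letters A, C, G and T and only scans the three forward reading frames. Real gene finding is far more involved, for the reasons given above.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# "*" marks the three stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon, starting at position 0."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of ATG...stop stretches.

    Only the three forward reading frames are scanned (a simplification).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE[codon] == "*":
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# How many codons encode each amino acid (stop codons excluded)?
degeneracy = Counter(a for a in AMINO if a != "*")

print(translate("ATGGCCTAA"))      # MA*
print(find_orfs("CCATGGCCTAAGG"))  # [(2, 11)]
print(len(degeneracy), min(degeneracy.values()), max(degeneracy.values()))  # 20 1 6
```

The last line confirms the counts from the text: 20 amino acids, each encoded by between 1 and 6 codons.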
[source](https://pixabay.com/photos/businessman-founding-financing-plan-3300891/)
DNA analysis and natural language processing have a lot in common: "encodings" are recognized from sequences; huge amounts of empirical data are available; regularities can be recognized that have so far been only partially understood; similarities must be recorded systematically; and the boundary between well-formed and ill-formed strings is fluid. In both areas, stochastic methods such as stochastic grammars and hidden Markov models are used.
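As an illustration of how a hidden Markov model can label a sequence, here is a small sketch of the Viterbi algorithm with two hidden states, "coding" and "noncoding". All probabilities below are invented toy values, chosen only so that the "coding" state prefers G and C; they are not taken from any real gene finder. The sketch assumes a non-empty sequence over A, C, G and T.

```python
import math

# Toy model: every number here is an assumption made for illustration.
STATES = ("coding", "noncoding")
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "noncoding": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(seq):
    """Return the most probable state path for a non-empty sequence."""
    # score[s] is the log-probability of the best path ending in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # one dict of back-pointers per step after the first
    for sym in seq[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            pointers[s] = best
            score[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][sym])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


path = viterbi("GCGCGCGCGCATATATATAT")
print(path[:10])  # ten times 'coding'
print(path[10:])  # ten times 'noncoding'
```

Working in log-probabilities avoids numerical underflow on long sequences, which is the usual reason for this design choice.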
Predicting protein structure is a major goal in the life sciences because a protein's function depends on its 3-dimensional shape. A complete solution to this problem would mean a scientific and economic revolution in medicine and pharmacy. In order to avoid difficult laboratory investigations, informatics methods are used for protein folding, which determine the structure of a protein from its amino acid sequence. A distinction is made between the primary, secondary, tertiary and quaternary structures of a protein.
[source](https://pixabay.com/photos/technology-informatics-computers-298256/)
The primary structure is the amino acid sequence. The secondary structure is an abstraction of the 3-dimensional shape in the form of local folding. The tertiary structure is the 3-dimensional shape of protein subunits. The quaternary structure reflects how the different subunits of a protein are spatially assembled. Experimentally determined tertiary structures are known for only a small fraction of all sequenced proteins.
Homology-based prediction of the tertiary structure compares the sequence whose structure is to be determined with sequences whose tertiary structures are already known. In ab initio structure prediction, tertiary structures with minimum free energy are determined by global optimization methods.
The question of whether and how a protein can form a stable complex with other molecules is called the protein docking problem. Methods for 1:1 docking of single pairs of proteins determine the relative positioning of the molecules to one another.
In 1:n docking, docking partners for a given protein are searched for in a molecular database. Methods of molecular dynamics, discrete techniques, genetic algorithms, geometric algorithms as well as data mining and knowledge discovery methods are used for 1:n docking and for homology-based structure prediction.
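The sequence comparison at the heart of homology-based prediction can be sketched with a classic global alignment score (Needleman-Wunsch). This is only a minimal sketch: the scoring values (match +1, mismatch -1, gap -2) and the function name are my own assumptions, and real tools use substitution matrices and more refined gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score of the best global alignment of a and b."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "ACGT"))  # 4
print(global_alignment_score("ACGT", "AGT"))   # 1
```

In a homology search, the query would be scored against every sequence with a known structure, and the best-scoring one would serve as the template.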
There are currently more than a hundred different databases in molecular biology, including DDBJ, EMBL, GenBank, PIR and SwissProt. Many of these databases are very large. GenBank, for example, contains approx. 4 × 10⁶ nucleotide sequences, which consist of a total of approx. 3 × 10¹² nucleotides. The biodatabases are not based on a uniform schema. The linking of heterogeneous biodatabases and the integration of their schemas are largely unsolved problems of great economic importance.
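To give a feeling for what schema integration means in the simplest case, here is a small sketch that maps records from two differently structured sources onto one common record type. The source names and field names are purely hypothetical and do not reproduce the real formats of any of the databases named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnifiedRecord:
    accession: str
    organism: str
    sequence: str


# Hypothetical field mappings: source field name -> unified field name.
FIELD_MAPS = {
    "source_a": {"acc": "accession", "species": "organism", "seq": "sequence"},
    "source_b": {"ID": "accession", "OS": "organism", "SQ": "sequence"},
}


def unify(source, record):
    """Convert a source-specific record (a dict) into a UnifiedRecord."""
    mapping = FIELD_MAPS[source]
    fields = {unified: record[original] for original, unified in mapping.items()}
    # Normalize the sequence so that formatting differences do not matter.
    fields["sequence"] = fields["sequence"].replace(" ", "").upper()
    return UnifiedRecord(**fields)


rec_a = unify("source_a", {"acc": "X1", "species": "Homo sapiens", "seq": "acgt"})
rec_b = unify("source_b", {"ID": "X1", "OS": "Homo sapiens", "SQ": "AC GT"})
print(rec_a == rec_b)  # True
```

Renaming fields is the easy part. The hard, largely unsolved part is reconciling differences in meaning between the schemas.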
[source](https://unsplash.com/photos/OqtafYT5kTw)
Evolution changes the proteins encoded in DNA over time. Computer-aided sequence analysis and classification make it possible to determine these changes and to derive from them the family trees, called phylogenetic trees, of related species. The approach is used in evolutionary paleontology, among other fields.
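One simple way to derive such a tree is to measure pairwise differences between aligned sequences and repeatedly join the two closest groups (average-linkage clustering, also known as UPGMA). The sketch below is a minimal illustration under my own assumptions: the four sequences are invented toy data, the taxon labels A to D are placeholders, and the tree is returned as nested tuples without branch lengths.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster named sequences by average linkage; return a nested-tuple tree."""
    # Each key is a (sub)tree, each value lists the leaf names it contains.
    clusters = {name: [name] for name in seqs}
    dist = {
        frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(seqs, 2)
    }

    def average_distance(c1, c2):
        pairs = [(x, y) for x in clusters[c1] for y in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: average_distance(*p))
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members
    return next(iter(clusters))


# Invented toy sequences, not real data.
toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGTTCCA", "D": "TGCATCCA"}
print(upgma(toy))  # ('D', ('C', ('A', 'B')))
```

Here A and B differ in only one position, so they are joined first; C joins them next, and D, the most different sequence, attaches last.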
Recently, the computer-aided identification of metabolic and regulatory pathways has gained importance. A metabolic pathway is an abstract representation of a metabolic process that lists the proteins and molecules involved. A regulatory pathway represents the flow of information in a cell type; its malfunction underlies many diseases, such as cancer. To investigate the similarity of metabolic or regulatory pathways in different organisms, methods of pattern recognition, similarity searches in databases and sequence analysis are used.
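If a pathway is reduced to the list of proteins and molecules involved, as described above, a very rough similarity measure is the overlap between two such lists. The sketch below uses the Jaccard index for this; the choice of measure and the identifiers are my own assumptions for illustration, and real pathway comparison also takes the order and connections of the reactions into account.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty pathways are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical component lists for the same pathway in two organisms.
pathway_org1 = {"enzyme_1", "enzyme_2", "enzyme_3"}
pathway_org2 = {"enzyme_2", "enzyme_3", "enzyme_4"}
print(jaccard(pathway_org1, pathway_org2))  # 0.5
```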
Genes are regions of DNA that encode proteins and thereby determine hereditary characteristics. A gene can be "expressed" in cells of a certain type, i.e. "read" for protein synthesis. This process is called gene expression.
With a single DNA chip, the concentrations or expression levels of thousands to hundreds of thousands of genes expressed in a given cell type can be measured. In differential displays, the differences between the expression levels in healthy and diseased cells of the same cell type can be determined. The very extensive data obtained in this way are the starting point for new approaches to diagnosis and therapy that combine computer methods and biochemical laboratory investigations.
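A common first step in comparing expression levels between healthy and diseased cells is the log2 fold change per gene. The sketch below is a minimal illustration: the gene names and numbers are invented, and the pseudocount of 1 and the threshold of 1 (a two-fold change) are assumptions on my part. A real analysis would also need replicates and statistical testing.

```python
import math


def log2_fold_changes(healthy, diseased, pseudocount=1.0):
    """log2 ratio of diseased to healthy expression for genes present in both."""
    shared = healthy.keys() & diseased.keys()
    return {
        gene: math.log2((diseased[gene] + pseudocount) / (healthy[gene] + pseudocount))
        for gene in shared
    }


def differentially_expressed(healthy, diseased, threshold=1.0):
    """Genes whose expression changed at least two-fold (for threshold=1)."""
    changes = log2_fold_changes(healthy, diseased)
    return sorted(g for g, lfc in changes.items() if abs(lfc) >= threshold)


# Invented expression levels, not real measurements.
healthy = {"gene_a": 99, "gene_b": 49, "gene_c": 7}
diseased = {"gene_a": 399, "gene_b": 51, "gene_c": 1}
print(differentially_expressed(healthy, diseased))  # ['gene_a', 'gene_c']
```

The pseudocount keeps the ratio defined when a gene has an expression level of zero in one of the two samples.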
[source](https://unsplash.com/photos/p-xSl33Wxyc)
Somewhat apart from bioinformatics, but related to it, is basic research on biocomputing. This describes the use of molecular biological methods such as gel electrophoresis and the polymerase chain reaction to solve complex mathematical problems such as cryptographic decryption. Whether biocomputing will ever become practical is still completely open.
Thanks