This SMILES specification is divided into two distinct parts: a syntactic specification, which defines how atoms, bonds, parentheses, digits and so forth are represented, and a semantic specification, which describes how those symbols are interpreted as a sensible molecule. For example, the syntax specifies how ring-closure digits are written, but the semantics require that they come in pairs. Likewise, the syntax specifies how aromatic elements are written, but the semantics determine whether a particular ring system is actually aromatic.
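To make the ring-closure example concrete, here is a minimal Python sketch of one semantic check: that every ring-closure number appears in pairs. The function name `unmatched_ring_closures` is hypothetical, and the sketch assumes the input has already passed a syntax check, so it only scans for ring-closure digits and ignores everything else.

```python
DIGITS = "0123456789"


def unmatched_ring_closures(smiles):
    """Return the set of ring-closure numbers that are opened but never closed."""
    open_rings = set()
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            # Digits inside a bracket atom are isotope, hydrogen count, charge
            # or atom class, not ring closures, so skip to the closing bracket.
            close = smiles.find("]", i)
            if close == -1:
                break  # unterminated bracket: a syntax problem, not ours
            i = close + 1
            continue
        two = smiles[i + 1:i + 3]
        if ch == "%" and len(two) == 2 and all(c in DIGITS for c in two):
            number = int(two)  # two-digit form: '%' DIGIT DIGIT
            i += 3
        elif ch in DIGITS:
            number = int(ch)  # single-digit form
            i += 1
        else:
            i += 1
            continue
        # First occurrence opens the ring, second occurrence closes it.
        open_rings ^= {number}
    return open_rings
```

A string such as `C1CCCCC` is syntactically well formed, yet this check reports ring 1 as unmatched, which is exactly the kind of error the semantic rules catch and the grammar does not.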
For this specification, the syntax and semantics are explained separately; in practice, the syntax and semantics are usually mixed together in the code that implements a SMILES parser. This chapter is only concerned with syntax.
| Section | Formal Grammar |
| ATOMS | |
| 3.1 | atom ::= bracket_atom | aliphatic_organic | aromatic_organic | '*' |
| ORGANIC SUBSET ATOMS | |
| 3.1.5 | aliphatic_organic ::= 'B' | 'C' | 'N' | 'O' | 'S' | 'P' | 'F' | 'Cl' | 'Br' | 'I' |
| 3.5 | aromatic_organic ::= 'b' | 'c' | 'n' | 'o' | 's' | 'p' |
| BRACKET ATOMS | |
| 3.1.1 | bracket_atom ::= '[' isotope? symbol chiral? hcount? charge? class? ']' |
| 3.1.1 | symbol ::= element_symbols | aromatic_symbols | '*' |
| 3.1.4 | isotope ::= NUMBER |
| 3.1.1 | element_symbols ::= 'H'| 'He' |'Li'|'Be'| 'B' |'C' |'N' |'O' |'F' |'Ne' |'Na'|'Mg'| 'Al'|'Si'|'P' |'S' |'Cl'|'Ar' |'K' |'Ca'|'Sc'|'Ti'|'V' |'Cr'|'Mn'|'Fe'|'Co'|'Ni'|'Cu'|'Zn'|'Ga'|'Ge'|'As'|'Se'|'Br'|'Kr' |'Rb'|'Sr'|'Y' |'Zr'|'Nb'|'Mo'|'Tc'|'Ru'|'Rh'|'Pd'|'Ag'|'Cd'|'In'|'Sn'|'Sb'|'Te'|'I' |'Xe' |'Cs'|'Ba'| 'Hf'|'Ta'|'W' |'Re'|'Os'|'Ir'|'Pt'|'Au'|'Hg'|'Tl'|'Pb'|'Bi'|'Po'|'At'|'Rn' |'Fr'|'Ra'| 'Rf'|'Db'|'Sg'|'Bh'|'Hs'|'Mt'|'Ds'|'Rg' |'La'|'Ce'|'Pr'|'Nd'|'Pm'|'Sm'|'Eu'|'Gd'|'Tb'|'Dy'|'Ho'|'Er'|'Tm'|'Yb'|'Lu' |'Ac'|'Th'|'Pa'|'U' |'Np'|'Pu'|'Am'|'Cm'|'Bk'|'Cf'|'Es'|'Fm'|'Md'|'No'|'Lr' |
| 3.5 | aromatic_symbols ::= 'c' | 'n' | 'o' | 'p' | 's' | 'se' | 'as' |
| CHIRALITY | |
| 3.9 | chiral ::= '@' | '@@' | '@TH1' | '@TH2' | '@AL1' | '@AL2' | '@SP1' | '@SP2' | '@SP3' | '@TB1' | '@TB2' | '@TB3' | ... | '@TB19' | '@TB20' | '@OH1' | '@OH2' | '@OH3' | ... | '@OH29' | '@OH30' |
| HYDROGENS | |
| 3.1.2 | hcount ::= 'H' | 'H' DIGIT |
| CHARGE | |
| 3.1.3 | charge ::= '-' | '-' DIGIT | '+' | '+' DIGIT | '--' *deprecated* | '++' *deprecated* |
| ATOM CLASS | |
| 3.1.7 | class ::= ':' NUMBER |
| BONDS AND CHAINS | |
| 3.2, 3.9.3 | bond ::= '-' | '=' | '#' | '$' | ':' | '/' | '\' |
| 3.4 | ringbond ::= bond? DIGIT | bond? '%' DIGIT DIGIT |
| 3.3 | branched_atom ::= atom ringbond* branch* |
| | branch ::= '(' chain ')' | '(' bond chain ')' | '(' dot chain ')' |
| | chain ::= branched_atom | chain branched_atom | chain bond branched_atom | chain dot branched_atom |
| 3.7 | dot ::= '.' |
| SMILES STRINGS | |
| 3.10 | smiles ::= chain terminator |
| | terminator ::= SPACE | TAB | LINEFEED | CARRIAGE_RETURN | END_OF_STRING |
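The grammar above can be turned into a small recursive-descent recognizer. The following Python sketch is illustrative only and checks syntax, not semantics. All names in it (`is_valid_syntax`, `SmilesSyntaxError`, and the helper functions) are hypothetical. Two simplifications are assumed: the bracket-atom pattern accepts any capitalized one- or two-letter symbol instead of consulting the element table, and it accepts any one- or two-digit chirality index instead of checking the permitted range. The left-recursive `chain` rule is implemented as a loop.

```python
import re

# Bracket atom: '[' isotope? symbol chiral? hcount? charge? class? ']'
# Assumption: element symbols and chirality indices are matched loosely.
BRACKET_ATOM = re.compile(
    r"\[\d*(?:[A-Z][a-z]?|se|as|[cnops]|\*)"
    r"(?:@(?:@|(?:TH|AL|SP|TB|OH)\d\d?)?)?"
    r"(?:H\d?)?(?:\+\+|--|[+-]\d?)?(?::\d+)?\]",
    re.ASCII,
)
# Organic-subset atoms and the wildcard; two-letter symbols are tried first.
ORGANIC_ATOM = re.compile(r"Cl|Br|[BCNOSPFI]|[bcnosp]|\*")
RINGBOND = re.compile(r"[-=#$:/\\]?(?:\d|%\d\d)", re.ASCII)
BOND_OR_DOT = re.compile(r"[-=#$:/\\.]")
TERMINATORS = " \t\n\r"
CHAIN_STOP = ")" + TERMINATORS


class SmilesSyntaxError(ValueError):
    """Raised when the input does not match the grammar."""


def _atom(s, i):
    m = BRACKET_ATOM.match(s, i) or ORGANIC_ATOM.match(s, i)
    if m is None:
        raise SmilesSyntaxError(f"expected an atom at position {i}")
    return m.end()


def _branched_atom(s, i):
    # branched_atom ::= atom ringbond* branch*
    i = _atom(s, i)
    while (m := RINGBOND.match(s, i)) is not None:
        i = m.end()
    while i < len(s) and s[i] == "(":
        i = _branch(s, i)
    return i


def _branch(s, i):
    # branch ::= '(' chain ')' | '(' bond chain ')' | '(' dot chain ')'
    i += 1  # consume '('
    if BOND_OR_DOT.match(s, i):
        i += 1
    i = _chain(s, i)
    if i >= len(s) or s[i] != ")":
        raise SmilesSyntaxError(f"expected ')' at position {i}")
    return i + 1


def _chain(s, i):
    # chain ::= branched_atom (bond | dot)? branched_atom ...
    i = _branched_atom(s, i)
    while i < len(s) and s[i] not in CHAIN_STOP:
        if BOND_OR_DOT.match(s, i):
            i += 1
        i = _branched_atom(s, i)
    return i


def is_valid_syntax(smiles):
    """Return True if the input starts with a chain followed by a terminator."""
    try:
        end = _chain(smiles, 0)
    except SmilesSyntaxError:
        return False
    return end == len(smiles) or smiles[end] in TERMINATORS
```

Because parsing stops at the first terminator, anything after a space, tab or line break is ignored by this sketch. Note that `is_valid_syntax("C1CCCCC")` returns `True`: the unpaired ring-closure digit is a semantic error, which this chapter's grammar does not address.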
Copyright © 2007
Andrew Dalke, Craig A. James
Content is available under GNU Free Documentation License 1.2