Figure 1 Flow chart of ACGC.
ACGC comprises three components: the atomic adjacent group (AAG), the
shape factor (SF) and the atomic connectivity factor (ACF). The workflow
of ACGC is shown in Figure 1. AAG is a systematic group definition
approach that explicitly decomposes each molecular structure into a set
of non-overlapping functional groups based on the relationship between
core and adjacent atoms. SF accounts for the effect of molecular shape.
ACF computes topological position factors from atomic properties to
describe the positions of the groups within the molecule.
We also evaluate the model by external verification, internal
verification and a Y-randomization test [31].
Atomic adjacent group (AAG)
Traditional group contribution methods need higher-order groups to
divide a molecule accurately, and their division rules are complicated.
The AAG approach defines groups from atomic adjacency relationships.
Atoms are classified into two types: endpoint atoms and connection
atoms. An atom bonded to only one non-hydrogen atom is an endpoint atom;
an atom bonded to two or more non-hydrogen atoms is a connection atom. A
group consists of a core atom, its adjacent atoms and the bond types
between them, as shown in Figure 2. The core atom is always a connection
atom, that is, it has two or more non-hydrogen neighbors.
Core atoms include carbon, oxygen, nitrogen, silicon, sulfur and
phosphorus, among others. The atoms adjacent to a core atom may be
endpoint atoms or connection atoms. An endpoint atom is written in
parentheses, '()', and a connection atom in square brackets, '[]'. The
bond type between the core atom and an adjacent atom is written before
that adjacent atom. Single, double and triple bonds in linear
(non-cyclic) structures are written as -, = and ≡, respectively. Single,
double and triple bonds in cyclic structures and aromatic bonds are
written as ~, ≈, ≋ and ∷, respectively. Four examples illustrating these
group definition rules are given in Table S1 of the Supporting
Information (contribution-coefficient.docx).
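
The classification and notation rules above can be illustrated with a
minimal Python sketch. It is not the authors' implementation. The input
format (a list of heavy-atom symbols plus bond tuples), the function
names, the "isolated" label for atoms with no heavy neighbors, the exact
string layout (bond symbol placed directly before the wrapped adjacent
atom) and the sorted neighbor order are all assumptions made for
illustration; Table S1 remains the reference for the actual labels.

```python
from collections import defaultdict

# Bond symbols as listed in the text, keyed by (bond order, in_ring).
BOND_SYMBOLS = {
    (1, False): "-", (2, False): "=", (3, False): "≡",
    (1, True): "~", (2, True): "≈", (3, True): "≋",
}
AROMATIC_SYMBOL = "∷"


def bond_symbol(order, in_ring=False, aromatic=False):
    """Return the AAG bond symbol for one bond."""
    if aromatic:
        return AROMATIC_SYMBOL
    return BOND_SYMBOLS[(order, in_ring)]


def build_adjacency(bonds):
    """Map each heavy-atom index to a list of (neighbor index, bond symbol)."""
    adj = defaultdict(list)
    for i, j, order, in_ring, aromatic in bonds:
        sym = bond_symbol(order, in_ring, aromatic)
        adj[i].append((j, sym))
        adj[j].append((i, sym))
    return adj


def classify_atoms(atoms, bonds):
    """Label each heavy atom as 'endpoint' (one non-hydrogen neighbor) or
    'connection' (two or more). Atoms with no heavy neighbors are labelled
    'isolated'; that case is not covered by the text (assumption)."""
    adj = build_adjacency(bonds)
    kinds = []
    for i in range(len(atoms)):
        n = len(adj[i])
        kinds.append("connection" if n >= 2 else "endpoint" if n == 1 else "isolated")
    return kinds


def aag_groups(atoms, bonds):
    """Build one label per core (connection) atom.
    Assumed layout: core symbol, then for each neighbor the bond symbol
    followed by (X) for an endpoint atom or [X] for a connection atom."""
    adj = build_adjacency(bonds)
    kinds = classify_atoms(atoms, bonds)
    groups = {}
    for core, kind in enumerate(kinds):
        if kind != "connection":
            continue
        parts = []
        for nbr, sym in adj[core]:
            left, right = ("[", "]") if kinds[nbr] == "connection" else ("(", ")")
            parts.append(f"{sym}{left}{atoms[nbr]}{right}")
        # Sorting only makes the label deterministic (assumption).
        groups[core] = atoms[core] + "".join(sorted(parts))
    return groups


# Butan-2-one, heavy atoms only: C0-C1(=O4)-C2-C3.
# Each bond is (i, j, order, in_ring, aromatic).
atoms = ["C", "C", "C", "C", "O"]
bonds = [
    (0, 1, 1, False, False),
    (1, 2, 1, False, False),
    (2, 3, 1, False, False),
    (1, 4, 2, False, False),
]
kinds = classify_atoms(atoms, bonds)
groups = aag_groups(atoms, bonds)
```

In this example the carbonyl carbon (index 1) and the methylene carbon
(index 2) are the only connection atoms, so they are the two core atoms;
the remaining atoms are endpoint atoms and appear only inside
parentheses in the labels.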