Bulk data access¶
This tutorial explains how to retrieve full tables from the database into pandas DataFrames.
The following tables are available from mendeleev
¶
elements
ionicradii
ionizationenergies
oxidationstates
groups
series
isotopes
All data is stored in a sqlite database that is shipped together with the package. You can interact directly with the database if you need more flexibility but for convenience mendeleev
provides a few functions in the fetch
module to retrieve data.
To fetch whole tables you can use fetch_table
. The function can be imported from mendeleev.fetch
[1]:
from mendeleev.fetch import fetch_table
To retrieve a table call the fetch_table
with the table name as argument. Here we’ll get probably the most important table elements
with basis data on each element
[3]:
ptable = fetch_table('elements')
Now we can use pandas’ capabilities to work with the data.
[4]:
ptable.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 118 entries, 0 to 117
Data columns (total 70 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 annotation 118 non-null object
1 atomic_number 118 non-null int64
2 atomic_radius 90 non-null float64
3 atomic_volume 91 non-null float64
4 block 118 non-null object
5 boiling_point 96 non-null float64
6 density 95 non-null float64
7 description 109 non-null object
8 dipole_polarizability 117 non-null float64
9 electron_affinity 77 non-null float64
10 electronic_configuration 118 non-null object
11 evaporation_heat 88 non-null float64
12 fusion_heat 75 non-null float64
13 group_id 90 non-null float64
14 lattice_constant 87 non-null float64
15 lattice_structure 91 non-null object
16 melting_point 100 non-null float64
17 name 118 non-null object
18 period 118 non-null int64
19 series_id 118 non-null int64
20 specific_heat 81 non-null float64
21 symbol 118 non-null object
22 thermal_conductivity 66 non-null float64
23 vdw_radius 103 non-null float64
24 covalent_radius_cordero 96 non-null float64
25 covalent_radius_pyykko 118 non-null float64
26 en_pauling 85 non-null float64
27 en_allen 71 non-null float64
28 jmol_color 109 non-null object
29 cpk_color 103 non-null object
30 proton_affinity 32 non-null float64
31 gas_basicity 32 non-null float64
32 heat_of_formation 89 non-null float64
33 c6 43 non-null float64
34 covalent_radius_bragg 37 non-null float64
35 vdw_radius_bondi 28 non-null float64
36 vdw_radius_truhlar 16 non-null float64
37 vdw_radius_rt 9 non-null float64
38 vdw_radius_batsanov 65 non-null float64
39 vdw_radius_dreiding 21 non-null float64
40 vdw_radius_uff 103 non-null float64
41 vdw_radius_mm3 94 non-null float64
42 abundance_crust 88 non-null float64
43 abundance_sea 81 non-null float64
44 molcas_gv_color 103 non-null object
45 en_ghosh 103 non-null float64
46 vdw_radius_alvarez 94 non-null float64
47 c6_gb 86 non-null float64
48 atomic_weight 118 non-null float64
49 atomic_weight_uncertainty 74 non-null float64
50 is_monoisotopic 21 non-null object
51 is_radioactive 118 non-null bool
52 cas 118 non-null object
53 atomic_radius_rahm 96 non-null float64
54 geochemical_class 76 non-null object
55 goldschmidt_class 118 non-null object
56 metallic_radius 56 non-null float64
57 metallic_radius_c12 63 non-null float64
58 covalent_radius_pyykko_double 108 non-null float64
59 covalent_radius_pyykko_triple 80 non-null float64
60 discoverers 118 non-null object
61 discovery_year 105 non-null float64
62 discovery_location 106 non-null object
63 name_origin 118 non-null object
64 sources 118 non-null object
65 uses 112 non-null object
66 mendeleev_number 118 non-null int64
67 dipole_polarizability_unc 117 non-null float64
68 pettifor_number 103 non-null float64
69 glawe_number 103 non-null float64
dtypes: bool(1), float64(46), int64(4), object(19)
memory usage: 63.8+ KB
For clarity let’s take only a subset of columns
[5]:
cols = ['atomic_number', 'symbol', 'atomic_radius', 'en_pauling', 'block', 'vdw_radius_mm3']
[6]:
ptable[cols].head()
[6]:
atomic_number | symbol | atomic_radius | en_pauling | block | vdw_radius_mm3 | |
---|---|---|---|---|---|---|
0 | 1 | H | 25.0 | 2.20 | s | 162.0 |
1 | 2 | He | 120.0 | NaN | s | 153.0 |
2 | 3 | Li | 145.0 | 0.98 | s | 255.0 |
3 | 4 | Be | 105.0 | 1.57 | s | 223.0 |
4 | 5 | B | 85.0 | 2.04 | p | 215.0 |
It is quite easy now to get descriptive statistics on the data.
[7]:
ptable[cols].describe()
[7]:
atomic_number | atomic_radius | en_pauling | vdw_radius_mm3 | |
---|---|---|---|---|
count | 118.000000 | 90.000000 | 85.000000 | 94.000000 |
mean | 59.500000 | 149.844444 | 1.748588 | 248.468085 |
std | 34.207699 | 40.079110 | 0.634442 | 36.017828 |
min | 1.000000 | 25.000000 | 0.700000 | 153.000000 |
25% | 30.250000 | 135.000000 | 1.240000 | 229.000000 |
50% | 59.500000 | 145.000000 | 1.700000 | 244.000000 |
75% | 88.750000 | 178.750000 | 2.160000 | 269.250000 |
max | 118.000000 | 260.000000 | 3.980000 | 364.000000 |
Isotopes table¶
Let try and retrieve another table, namely isotopes
[8]:
isotopes = fetch_table('isotopes', index_col='id')
[9]:
isotopes.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 406 entries, 1 to 406
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 atomic_number 406 non-null int64
1 mass 377 non-null float64
2 abundance 288 non-null float64
3 mass_number 406 non-null int64
4 mass_uncertainty 377 non-null float64
5 is_radioactive 406 non-null bool
6 half_life 121 non-null float64
7 half_life_unit 85 non-null object
8 spin 323 non-null float64
9 g_factor 323 non-null float64
10 quadrupole_moment 320 non-null float64
dtypes: bool(1), float64(7), int64(2), object(1)
memory usage: 35.3+ KB
Merge the elements table with the isotopes¶
We can now perform SQL-like merge operation on two DataFrame
s and produce an outer join
[10]:
import pandas as pd
[11]:
merged = pd.merge(ptable[cols], isotopes, how='outer', on='atomic_number')
now we have the following columns in the merged
DataFrame
[12]:
merged.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 406 entries, 0 to 405
Data columns (total 16 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 atomic_number 406 non-null int64
1 symbol 406 non-null object
2 atomic_radius 328 non-null float64
3 en_pauling 313 non-null float64
4 block 406 non-null object
5 vdw_radius_mm3 350 non-null float64
6 mass 377 non-null float64
7 abundance 288 non-null float64
8 mass_number 406 non-null int64
9 mass_uncertainty 377 non-null float64
10 is_radioactive 406 non-null bool
11 half_life 121 non-null float64
12 half_life_unit 85 non-null object
13 spin 323 non-null float64
14 g_factor 323 non-null float64
15 quadrupole_moment 320 non-null float64
dtypes: bool(1), float64(10), int64(2), object(3)
memory usage: 51.1+ KB
[13]:
merged.head()
[13]:
atomic_number | symbol | atomic_radius | en_pauling | block | vdw_radius_mm3 | mass | abundance | mass_number | mass_uncertainty | is_radioactive | half_life | half_life_unit | spin | g_factor | quadrupole_moment | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | H | 25.0 | 2.2 | s | 162.0 | 1.007825 | 0.999720 | 1 | 6.000000e-10 | False | NaN | None | 0.5 | 5.585695 | 0.00000 |
1 | 1 | H | 25.0 | 2.2 | s | 162.0 | 2.014102 | 0.000280 | 2 | 8.000000e-10 | False | NaN | None | 1.0 | 0.857438 | 0.00286 |
2 | 1 | H | 25.0 | 2.2 | s | 162.0 | NaN | NaN | 3 | NaN | True | NaN | None | 0.5 | 5.957994 | 0.00000 |
3 | 2 | He | 120.0 | NaN | s | 153.0 | 3.016029 | 0.000002 | 3 | 2.000000e-08 | False | NaN | None | 0.5 | -4.254995 | 0.00000 |
4 | 2 | He | 120.0 | NaN | s | 153.0 | 4.002603 | 0.999998 | 4 | 4.000000e-10 | False | NaN | None | 0.0 | 0.000000 | 0.00000 |
To display all the isotopes of Silicon
[14]:
merged[merged['symbol'] == 'Si']
[14]:
atomic_number | symbol | atomic_radius | en_pauling | block | vdw_radius_mm3 | mass | abundance | mass_number | mass_uncertainty | is_radioactive | half_life | half_life_unit | spin | g_factor | quadrupole_moment | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
28 | 14 | Si | 110.0 | 1.9 | p | 229.0 | 27.976927 | 0.92191 | 28 | 3.000000e-09 | False | NaN | None | 0.0 | 0.00000 | 0.0 |
29 | 14 | Si | 110.0 | 1.9 | p | 229.0 | 28.976495 | 0.04699 | 29 | 3.000000e-09 | False | NaN | None | 0.5 | -1.11058 | 0.0 |
30 | 14 | Si | 110.0 | 1.9 | p | 229.0 | 29.973770 | 0.03110 | 30 | 2.000000e-08 | False | NaN | None | 0.0 | 0.00000 | 0.0 |
Ionic radii¶
The function to fetch ionic radii is called fetch_ionic_radii
and can either fetch ionic or crystal radii depending on the radius
argument.
[2]:
from mendeleev.fetch import fetch_ionic_radii
[3]:
irs = fetch_ionic_radii(radius="ionic_radius")
irs.head(10)
[3]:
coordination | I | II | III | IIIPY | IV | IVPY | IVSQ | IX | V | VI | VII | VIII | X | XI | XII | XIV | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
atomic_number | charge | ||||||||||||||||
1 | 1 | -38.0 | -18.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | 1 | NaN | NaN | NaN | NaN | 59.0 | NaN | NaN | NaN | NaN | 76.0 | NaN | 92.0 | NaN | NaN | NaN | NaN |
4 | 2 | NaN | NaN | 16.0 | NaN | 27.0 | NaN | NaN | NaN | NaN | 45.0 | NaN | NaN | NaN | NaN | NaN | NaN |
5 | 3 | NaN | NaN | 1.0 | NaN | 11.0 | NaN | NaN | NaN | NaN | 27.0 | NaN | NaN | NaN | NaN | NaN | NaN |
6 | 4 | NaN | NaN | -8.0 | NaN | 15.0 | NaN | NaN | NaN | NaN | 16.0 | NaN | NaN | NaN | NaN | NaN | NaN |
7 | -3 | NaN | NaN | NaN | NaN | 146.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 16.0 | NaN | NaN | NaN | NaN | NaN | NaN | |
5 | NaN | NaN | -10.4 | NaN | NaN | NaN | NaN | NaN | NaN | 13.0 | NaN | NaN | NaN | NaN | NaN | NaN | |
8 | -2 | NaN | 135.0 | 136.0 | NaN | 138.0 | NaN | NaN | NaN | NaN | 140.0 | NaN | 142.0 | NaN | NaN | NaN | NaN |
9 | -1 | NaN | 128.5 | 130.0 | NaN | 131.0 | NaN | NaN | NaN | NaN | 133.0 | NaN | NaN | NaN | NaN | NaN | NaN |
Ionization energies¶
To fetch ionization energies use fetch_ionization_energies
that takes a degree
(default is degree=1
) argument that can either be a single integer or a list if integers to fetch multiple ionization energies.
[17]:
from mendeleev.fetch import fetch_ionization_energies
[25]:
ies = fetch_ionization_energies(degree=2)
ies.head(10)
[25]:
IE2 | |
---|---|
atomic_number | |
1 | NaN |
2 | 54.417763 |
3 | 75.640094 |
4 | 18.211153 |
5 | 25.154830 |
6 | 24.384500 |
7 | 29.601250 |
8 | 35.121110 |
9 | 34.970810 |
10 | 40.962960 |
[24]:
ies_multiple = fetch_ionization_energies(degree=[1, 3, 5])
ies_multiple.head(10)
[24]:
IE1 | IE3 | IE5 | |
---|---|---|---|
atomic_number | |||
1 | 13.598434 | NaN | NaN |
2 | 24.587388 | NaN | NaN |
3 | 5.391715 | 122.454354 | NaN |
4 | 9.322699 | 153.896198 | NaN |
5 | 8.298019 | 37.930580 | 340.226008 |
6 | 11.260296 | 47.887780 | 392.090500 |
7 | 14.534130 | 47.445300 | 97.890130 |
8 | 13.618054 | 54.935540 | 113.898900 |
9 | 17.422820 | 62.708000 | 114.249000 |
10 | 21.564540 | 63.423310 | 126.247000 |
Electronegativities¶
To fetch all data from electronegatuivity scales use fetch_electronegativities
. This can take a few seconds since most of the values need to be computed.
[26]:
from mendeleev.fetch import fetch_electronegativities
[27]:
ens = fetch_electronegativities()
ens.head(10)
[27]:
Allen | Allred-Rochow | Cottrell-Sutton | Ghosh | Gordy | Li-Xue | Martynov-Batsanov | Mulliken | Nagle | Pauling | Sanderson | |
---|---|---|---|---|---|---|---|---|---|---|---|
atomic_number | |||||||||||
1 | 13.610 | 0.000977 | 0.176777 | 0.263800 | 0.031250 | {('I', ''): -3.540721753312244, ('II', ''): -2... | 3.687605 | 6.799217 | 0.605388 | 2.20 | 2.187771 |
2 | 24.590 | 0.000803 | 0.192241 | 0.442712 | 0.036957 | {} | 6.285107 | 12.293694 | 1.130639 | NaN | 1.000000 |
3 | 5.392 | 0.000073 | 0.098866 | 0.105093 | 0.009774 | {('IV', ''): 1.7160634314550876, ('VI', ''): 1... | 2.322007 | 2.695857 | 0.182650 | 0.98 | 0.048868 |
4 | 9.323 | 0.000187 | 0.138267 | 0.144986 | 0.019118 | {} | 3.710381 | 4.661350 | 0.375615 | 1.57 | 0.126847 |
5 | 12.130 | 0.000360 | 0.174895 | 0.184886 | 0.030588 | {} | 4.877958 | 4.149010 | 0.526974 | 2.04 | 0.254627 |
6 | 15.050 | 0.000578 | 0.208167 | 0.224776 | 0.043333 | {} | 6.083300 | 5.630148 | 0.707393 | 2.55 | 0.427525 |
7 | 18.130 | 0.000774 | 0.234371 | 0.264930 | 0.054930 | {} | 7.306768 | 7.267065 | 0.877498 | 3.04 | 0.577482 |
8 | 21.360 | 0.001146 | 0.268742 | 0.304575 | 0.072222 | {} | 8.496136 | 6.809027 | 1.042218 | 3.44 | 0.941649 |
9 | 24.800 | 0.001270 | 0.285044 | 0.344443 | 0.081250 | {} | 9.701808 | 8.711410 | 1.232373 | 3.98 | 1.017681 |
10 | 28.310 | 0.001303 | 0.295488 | 0.384390 | 0.087313 | {} | 10.918389 | NaN | 1.443255 | NaN | 1.000000 |