Setup and settings
This section shows how to import and configure the library.
Import the library
The library is composed of three submodules:
- openalex_analysis.data: manages the cache and the downloads from the OpenAlex API.
- openalex_analysis.analysis: includes openalex_analysis.data and provides methods to run analyses on the data.
- openalex_analysis.plot: includes openalex_analysis.analysis and provides methods to create plots.
For example, if you only need the library to manage the downloads from OpenAlex, you can import only openalex_analysis.data, as you won’t need the other methods located in openalex_analysis.analysis or openalex_analysis.plot.
If you are unsure which submodule you need, import openalex_analysis.plot to have all the methods available.
[1]:
# If you want to work with works, you can import the library as follows:
from openalex_analysis.plot import WorksPlot
WorksPlot()
[1]:
<openalex_analysis.plot.entities_plot.WorksPlot at 0x7080b958be60>
[2]:
# If you want to work with works and only need the methods that manage the downloads from OpenAlex, you can import the library as follows:
from openalex_analysis.data import WorksData
WorksData()
[2]:
<openalex_analysis.data.entities_data.WorksData at 0x7080e4310a10>
Configure the library
The following example sets the email address, which lets the library use the polite pool from OpenAlex.
[3]:
from openalex_analysis.plot import config, WorksPlot
config.email = "email@example.com"
WorksPlot()
[3]:
<openalex_analysis.plot.entities_plot.WorksPlot at 0x7080b8001490>
Default configuration
These are the default parameters. You can change them when importing the library, as in the email example above.
[4]:
# we need this module to set the path of 'project_data_folder_path':
from os.path import join, expanduser
config.email = None
config.api_key = None
config.openalex_url = "https://api.openalex.org"
config.http_retry_times = 3
config.disable_tqdm_loading_bar = False
config.n_max_entities = 10000
config.project_data_folder_path = join(expanduser("~"), "openalex-analysis", "data")
config.parquet_compression = "brotli"
config.max_storage_percent = 95
config.max_storage_files = 10000
config.max_storage_size = 5e9
config.min_storage_files = 1000
config.min_storage_size = 5e8
config.cache_max_age = 365
config.log_level = 'WARNING'
Use a configuration file
To avoid having to set the configuration of the library each time you import it, you can use a configuration file.
When the library is imported, if a configuration file exists at ~/openalex-analysis/openalex-analysis-conf.toml, it is automatically loaded.
Alternatively, if the configuration file is stored elsewhere, you can load it as follows:
[5]:
from openalex_analysis.analysis import load_config_from_file
load_config_from_file("my-openalex-analysis-conf.toml")
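To decide which of the two loading paths applies, it can help to check whether a file exists at the default location. The following is a minimal sketch using only the standard library; the helper name `default_config_path` is hypothetical and is not part of openalex_analysis.

```python
from pathlib import Path


def default_config_path() -> Path:
    # Hypothetical helper (not part of openalex_analysis): rebuilds the
    # default location ~/openalex-analysis/openalex-analysis-conf.toml
    return Path.home() / "openalex-analysis" / "openalex-analysis-conf.toml"


conf_path = default_config_path()
if conf_path.is_file():
    print(f"Found {conf_path}: it should be loaded automatically on import.")
else:
    print(f"No file at {conf_path}: load a file explicitly with load_config_from_file().")
```

If the file is missing, the library's default parameters presumably stay in effect until you set them in code or load a file from another location.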
Example of a configuration file:
[6]:
n_max_entities = 10000
log_level = 'WARNING'