{ "cells": [ { "cell_type": "markdown", "id": "a0b1707a-48e0-49ae-8dbc-dbc8ae2ac0eb", "metadata": {}, "source": [ "# Setup and settings\n", "\n", "This section shows how to import and configure the library." ] }, { "cell_type": "markdown", "id": "58a1a2df-a31d-484e-ba4f-209ee10ac964", "metadata": {}, "source": [ "## Import the library\n", "\n", "The library is composed of three submodules:\n", " - `openalex_analysis.data`: Manage the cache and the downloads from the OpenAlex API.\n", " - `openalex_analysis.analysis`: Include `openalex_analysis.data` and provides methods to run analysis on the data\n", " - `openalex_analysis.plot`: Include `openalex_analysis.analysis` and provides methods to create plots.\n", "\n", "For example, if you only need the library to manage the downloads from OpenAlex, you can import only `openalex_analysis.data` as you won't need the other methods located in `openalex_analysis.analysis` or `openalex_analysis.plot`.\n", "\n", "If you don't know, import `openalex_analysis.plot` to have all the methods available." ] }, { "cell_type": "code", "execution_count": 1, "id": "74f9cdc1-0036-47a0-8f00-c69c14c37340", "metadata": { "ExecuteTime": { "end_time": "2024-09-11T17:18:53.880834Z", "start_time": "2024-09-11T17:18:53.393745Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# If you want to work with works, you can import the library as follow:\n", "from openalex_analysis.plot import WorksPlot\n", "\n", "WorksPlot()" ] }, { "cell_type": "code", "execution_count": 2, "id": "b231a2f7-8c6b-4f14-9c4d-9a3461b878ae", "metadata": { "ExecuteTime": { "end_time": "2024-09-11T17:18:54.076634Z", "start_time": "2024-09-11T17:18:54.062696Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# If you want to work with institutions and only need the methods to manage the downloads from OpenAlex, you can import the library as follow:\n", "from openalex_analysis.data import WorksData\n", "\n", "WorksData()" ] }, { "cell_type": "markdown", "id": "07549f80-18d8-4382-8dd7-48fecbc0d42d", "metadata": {}, "source": [ "## Configure the library\n", "\n", "Example to configure the email (to use the polite pool from OpenAlex)." ] }, { "cell_type": "code", "execution_count": 3, "id": "34b6893b-e3d2-4450-93d6-5815c13bb9bd", "metadata": { "ExecuteTime": { "end_time": "2024-09-11T17:18:54.166783Z", "start_time": "2024-09-11T17:18:54.153015Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from openalex_analysis.plot import config, WorksPlot\n", "\n", "config.email = \"email@example.com\"\n", "\n", "WorksPlot()" ] }, { "cell_type": "markdown", "id": "a9de4958-5714-407e-83dd-f18fedcf0a48", "metadata": {}, "source": [ "### Default configuration\n", "\n", "Those are the default parameters. You can change them when importing the library, like in the example above with the email." ] }, { "cell_type": "code", "execution_count": 4, "id": "421b47b7-85bc-461e-a2ab-1ecb4fcab133", "metadata": { "ExecuteTime": { "end_time": "2024-09-11T17:18:54.235834Z", "start_time": "2024-09-11T17:18:54.233054Z" }, "tags": [] }, "outputs": [], "source": [ "# we need this module to set the path of 'project_data_folder_path':\n", "from os.path import join, expanduser\n", "\n", "config.email = None\n", "config.api_key = None\n", "config.openalex_url = \"https://api.openalex.org\"\n", "config.http_retry_times = 3\n", "config.disable_tqdm_loading_bar = False\n", "config.n_max_entities = 10000\n", "config.project_data_folder_path = join(expanduser(\"~\"), \"openalex-analysis\", \"data\")\n", "config.parquet_compression = \"brotli\"\n", "config.max_storage_percent = 95\n", "config.max_storage_files = 10000\n", "config.max_storage_size = 5e9\n", "config.min_storage_files = 1000\n", "config.min_storage_size = 5e8\n", "config.cache_max_age = 365\n", "config.log_level = 'WARNING'" ] }, { "cell_type": "markdown", "id": "df0b922f-91e0-4f24-8968-8da69ad086a5", "metadata": {}, "source": [ "### Use a configuration file\n", "\n", "To avoid having to set the configuration of the library each time you import it, you can use a configuration file.\n", "\n", "When the library is imported, if a configuration file exists at `~/openalex-analysis/openalex-analysis-conf.toml`, it is automatically loaded.\n", "\n", "Alternatively, you can load this configuration as follows if it is located at another location:" ] }, { "cell_type": "code", "execution_count": 5, "id": "e32f05f6-f963-4807-9f15-76fa10b6d441", "metadata": {}, "outputs": [], "source": [ "from openalex_analysis.analysis import load_config_from_file\n", "\n", "load_config_from_file(\"my-openalex-analysis-conf.toml\")" ] }, { "cell_type": "markdown", "id": "1b6052bc-9560-40f1-bce8-73feb53019e7", "metadata": {}, "source": [ "Example of configuration file:" ] }, { "cell_type": "code", "execution_count": 6, "id": "aa7d6c85-9024-4aaa-bb8b-1d77eb7fe35c", "metadata": {}, "outputs": [], "source": [ "n_max_entities = 10000\n", "log_level = 'WARNING'" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 5 }