Input
Reference
Find the reference for the input module here.
Capture the input data for the run. The Input class is the main
holding place for a decision model's input data. An input is built through
options and data. An input is loaded from a source, through the
InputLoader class. You may use the load function to
build an input from a source, or call the .load method on the InputLoader
class.
The most common sources, and the ones used by Nextmv Cloud, are stdin and
the local filesystem. The LocalInputLoader class handles both, and it is the
default input loader used by the load function.
JSON inputs
Work with JSON inputs. This is the default input format for Nextmv.
import nextmv
# Read JSON from stdin.
json_input_1 = nextmv.load()
print(json_input_1.data)
# Can also specify JSON format directly, and read from a file.
json_input_2 = nextmv.load(input_format=nextmv.InputFormat.JSON, path="input.json")
print(json_input_2.data)
TEXT inputs
Work with plain, utf-8 encoded text inputs. The data does not have to live
in a .txt file: any file with utf-8 encoded text can be used as input, such
as .mip or .lp files.
import nextmv
# Read text from stdin.
text_input_1 = nextmv.load(input_format=nextmv.InputFormat.TEXT)
print(text_input_1.data)
# Can also read from a file.
text_input_2 = nextmv.load(input_format=nextmv.InputFormat.TEXT, path="input.txt")
print(text_input_2.data)
CSV_ARCHIVE inputs
Work with one or more CSV files. In the resulting .data property of the
input, the keys are the filenames and the values are the tables, each
represented as a list of dictionaries. Each CSV file must be utf-8 encoded.
import nextmv
# Read multiple CSV files from a dir named "input".
csv_archive_input_1 = nextmv.load(input_format=nextmv.InputFormat.CSV_ARCHIVE)
print(csv_archive_input_1.data)
# Read multiple CSV files from a custom dir.
csv_archive_input_2 = nextmv.load(input_format=nextmv.InputFormat.CSV_ARCHIVE, path="custom_dir")
print(csv_archive_input_2.data)
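To make the shape of .data concrete, here is a minimal sketch that builds the same kind of structure using only the standard library. It is not the nextmv loader. The read_csv_dir helper and the file names are hypothetical, and using the file name without its extension as the key is an assumption, so check the keys of your own input.

```python
import csv
import tempfile
from pathlib import Path


def read_csv_dir(directory: str) -> dict:
    """Illustrative stand-in, not the nextmv loader: read every .csv file in a directory."""
    data = {}
    for csv_path in sorted(Path(directory).glob("*.csv")):
        with csv_path.open(encoding="utf-8", newline="") as f:
            # Assumption: the key is the file name without its extension.
            data[csv_path.stem] = list(csv.DictReader(f))
    return data


# Build a throwaway directory with two small CSV files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "stops.csv").write_text("id,demand\ns1,4\ns2,7\n", encoding="utf-8")
    Path(tmp, "vehicles.csv").write_text("id,capacity\nv1,10\n", encoding="utf-8")
    archive = read_csv_dir(tmp)

# Each file becomes a list of dictionaries, one per row.
print(archive["stops"])
print(archive["vehicles"])
```

In this sketch every value is a string, because csv.DictReader does not infer types. Convert numeric columns explicitly before using them in a model.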
MULTI_FILE inputs
When you need to work with a diverse set of files, use the MULTI_FILE input
format. Multi-file supports the following file formats:
- .json
- Text (utf-8 encoded text)
- .csv (which must be utf-8 encoded)
- .xlsx (Excel files)
To work with multi-file inputs, you need to define one or more
DataFile instances, each of which is associated with a file. You
can use the following convenience functions to create them:
- json_data_file: load a .json file.
- csv_data_file: load a .csv file.
- text_data_file: load a text file. Any file with utf-8 encoded text can be used, like .mip or .lp files.
import nextmv
# Define a data file for a JSON file.
json_file = nextmv.json_data_file("input.json")
# Define a data file for a CSV file.
csv_file = nextmv.csv_data_file("input.csv")
# Define a data file for a text file.
text_file = nextmv.text_data_file("input.txt")
# Load the multi-file input with the defined data files from a dir named "inputs".
multi_file_input_1 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    data_files=[json_file, csv_file, text_file],
)
print(multi_file_input_1.data)
# Load the multi-file input with the defined data files from a custom dir.
multi_file_input_2 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    path="custom_dir",
    data_files=[json_file, csv_file, text_file],
)
print(multi_file_input_2.data)
The resulting .data property of the Input object will contain a dictionary
where the keys are the names of the files, and the values are the data loaded
from those files.
When working with binary files, such as Excel files, you must create your own
DataFile instance. The most important parameter of this class is the .loader,
which is a Callable (function) that you provide. The signature of this
function is as follows:
def loader(file_path: str, *args, **kwargs) -> Any: ...
The file_path is the location the data is read from: the .name defined in
the DataFile is passed to this function with the correct directory already
joined. The .loader can also receive additional positional and keyword
arguments, which you set on the DataFile through the .loader_args and
.loader_kwargs parameters.
from typing import Any
import nextmv
# Define a custom loader for an Excel file.
def excel_loader(file_path: str) -> Any:
    import pandas as pd
    return pd.read_excel(file_path, sheet_name=None)
# Define a data file for an Excel file.
excel_file = nextmv.DataFile(
    name="input.xlsx",
    loader=excel_loader,
    loader_args=[],  # Optional, you don't need to define this if no args are needed.
    loader_kwargs={},  # Optional, you don't need to define this if no kwargs are needed.
)
# Load the multi-file input with the defined data files from a dir named "inputs".
multi_file_input_3 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    data_files=[excel_file],
)
print(multi_file_input_3.data)
# Load the multi-file input with the defined data files from a custom dir.
multi_file_input_4 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    path="custom_dir",
    data_files=[excel_file],
)
print(multi_file_input_4.data)
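The example above leaves loader_args and loader_kwargs empty. As a rough sketch of how a loader with extra parameters might look, the snippet below defines a hypothetical delimited_loader and calls it directly, without nextmv. The call convention (file path first, then the extra positional and keyword arguments) is an assumption based on the description above, and the file name input.dat is made up for illustration.

```python
import csv
import tempfile
from pathlib import Path
from typing import Any


def delimited_loader(file_path: str, delimiter: str, encoding: str = "utf-8") -> Any:
    """Hypothetical loader: the file path comes first, extra arguments follow."""
    with open(file_path, encoding=encoding, newline="") as f:
        return list(csv.DictReader(f, delimiter=delimiter))


# Values that would be passed as DataFile(loader_args=..., loader_kwargs=...).
loader_args = [";"]
loader_kwargs = {"encoding": "utf-8"}

with tempfile.TemporaryDirectory() as tmp:
    file_path = str(Path(tmp) / "input.dat")
    Path(file_path).write_text("name;age\nAlice;25\n", encoding="utf-8")
    # Assumed call convention: path first, then the extra args and kwargs.
    rows = delimited_loader(file_path, *loader_args, **loader_kwargs)

print(rows)
```

Keeping the file path as the first parameter and everything else configurable lets the same function serve several DataFile definitions that differ only in their arguments.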
As mentioned, the .data property of the Input object will contain a dictionary
where the keys are the names of the files, and the values are the data loaded
from those files. If you wish to customize the key names, you can use the
.input_data_key parameter in the DataFile class. The convenience functions
also support this argument, allowing you to specify a custom key name
for the data loaded from the file.
Here is an example of how to use the .input_data_key parameter with the
convenience functions and the DataFile class:
from typing import Any
import nextmv
# Define a data file for a JSON file with a custom key name.
json_file = nextmv.json_data_file("input.json", input_data_key="custom_json_key")
# Define a data file for a CSV file with a custom key name.
csv_file = nextmv.csv_data_file("input.csv", input_data_key="custom_csv_key")
# Define a data file for a text file with a custom key name.
text_file = nextmv.text_data_file("input.txt", input_data_key="custom_text_key")
# Define a custom loader for an Excel file that reads the first sheet.
def excel_loader(file_path: str) -> Any:
    import pandas as pd
    return pd.read_excel(file_path, sheet_name=0).to_dict()
# Define a data file for an Excel file using the custom loader and a custom key
# name.
excel_file = nextmv.DataFile(
    name="input.xlsx",
    loader=excel_loader,
    loader_args=[],  # Optional, you don't need to define this if no args are needed.
    loader_kwargs={},  # Optional, you don't need to define this if no kwargs are needed.
    input_data_key="custom_excel_key",  # Custom key name for the data loaded from the file.
)
# Load the multi-file input with the defined data files from a dir named "inputs".
multi_file_input_5 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    data_files=[json_file, csv_file, text_file, excel_file],
)
# View the loaded data.
nextmv.write(multi_file_input_5.data)
$ python main.py
{
  "custom_json_key": {
    "message": "Hello from JSON",
    "numbers": [
      1,
      2,
      3
    ]
  },
  "custom_csv_key": [
    {
      "name": "Alice",
      "age": "25",
      "city": "New York"
    },
    {
      "name": "Bob",
      "age": "30",
      "city": "London"
    },
    {
      "name": "Charlie",
      "age": "35",
      "city": "Tokyo"
    }
  ],
  "custom_text_key": "This is a test text file.\nIt contains multiple lines.\nHello from text!",
  "custom_excel_key": {
    "a": {
      "0": 1
    },
    "b": {
      "0": 2
    }
  }
}
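In the output above, the CSV values are strings (for example "25"), while the JSON numbers keep their type. A minimal sketch of post-processing such data follows. It works on a plain dictionary copied from part of that output rather than on a live Input object, and the variable names are illustrative.

```python
# Subset of the output shown above, copied as a plain dictionary.
data = {
    "custom_json_key": {"message": "Hello from JSON", "numbers": [1, 2, 3]},
    "custom_csv_key": [
        {"name": "Alice", "age": "25", "city": "New York"},
        {"name": "Bob", "age": "30", "city": "London"},
        {"name": "Charlie", "age": "35", "city": "Tokyo"},
    ],
}

# CSV values arrive as strings, so convert numeric columns explicitly.
people = [{**row, "age": int(row["age"])} for row in data["custom_csv_key"]]
average_age = sum(p["age"] for p in people) / len(people)

print(average_age)  # 30.0
print(sum(data["custom_json_key"]["numbers"]))  # 6
```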