Input
Reference
Find the reference for the input module here.
Capture the input data for the run. The Input class is the main holding place for a decision model's input data. An input is built from options and data, and it is loaded from a source through the InputLoader class. To build an input from a source, use the load function or call the .load method on the InputLoader class.
The most common sources, and the ones used by Nextmv Cloud, are stdin and the local filesystem. The LocalInputLoader class is provided for this reason, and it is the default input loader used by the load function.
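The "stdin or local filesystem" choice described above can be pictured with a small standard-library sketch. This is not how LocalInputLoader is implemented; read_json_input is a hypothetical helper that only illustrates the idea of reading from a file when a path is given and from stdin otherwise.

```python
import io
import json
import sys
from typing import Any, Optional, TextIO


def read_json_input(path: Optional[str] = None, stream: Optional[TextIO] = None) -> Any:
    """Hypothetical helper: read JSON from a file if a path is given, else from a stream."""
    if path is not None:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    # Fall back to stdin unless the caller supplies another stream.
    return json.load(stream if stream is not None else sys.stdin)


# Simulate piped input with an in-memory stream so the sketch runs anywhere.
data = read_json_input(stream=io.StringIO('{"stops": 3}'))
print(data)
```

In a real program you would call nextmv.load() instead; the sketch is only meant to show why no path is needed when the data arrives on stdin.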
JSON inputs
Work with JSON inputs. This is the default input format for Nextmv.
import nextmv
# Read JSON from stdin.
json_input_1 = nextmv.load()
print(json_input_1.data)
# Can also specify JSON format directly, and read from a file.
json_input_2 = nextmv.load(input_format=nextmv.InputFormat.JSON, path="input.json")
print(json_input_2.data)
TEXT inputs
Work with plain, utf-8 encoded text inputs. The data does not have to live in a .txt file: any file with utf-8 encoded text can be used as input, such as .mip or .lp files.
import nextmv
# Read text from stdin.
text_input_1 = nextmv.load(input_format=nextmv.InputFormat.TEXT)
print(text_input_1.data)
# Can also read from a file.
text_input_2 = nextmv.load(input_format=nextmv.InputFormat.TEXT, path="input.txt")
print(text_input_2.data)
CSV_ARCHIVE inputs
Work with one or multiple CSV files. In the resulting .data property of the input, the keys are the filenames and the values are the dataframes, each represented as a list of dictionaries. Each CSV file must be utf-8 encoded.
import nextmv
# Read multiple CSV files from a dir named "input".
csv_archive_input_1 = nextmv.load(input_format=nextmv.InputFormat.CSV_ARCHIVE)
print(csv_archive_input_1.data)
# Read multiple CSV files from a custom dir.
csv_archive_input_2 = nextmv.load(input_format=nextmv.InputFormat.CSV_ARCHIVE, path="custom_dir")
print(csv_archive_input_2.data)
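To make the shape of the CSV_ARCHIVE data concrete, here is a minimal standard-library sketch that builds a dictionary of the same form: one key per CSV file, one list of row dictionaries per value. It is an illustration only, not the library's code. The helper read_csv_dir and the file names are hypothetical, and keying by the full file name is an assumption; check the reference for the exact key format.

```python
import csv
import tempfile
from pathlib import Path


def read_csv_dir(directory: str) -> dict:
    """Hypothetical helper: map each CSV file in a directory to its rows."""
    data = {}
    for csv_path in sorted(Path(directory).glob("*.csv")):
        with open(csv_path, encoding="utf-8", newline="") as f:
            # Assumption: the key is the file name, extension included.
            data[csv_path.name] = list(csv.DictReader(f))
    return data


# Build a throwaway directory with two small CSV files to show the shape.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "stops.csv").write_text("id,quantity\ns1,3\ns2,5\n", encoding="utf-8")
    Path(tmp, "vehicles.csv").write_text("id,capacity\nv1,10\n", encoding="utf-8")
    archive = read_csv_dir(tmp)

print(archive)
```

In this sketch every cell stays a string, because csv.DictReader does no type conversion by default; convert numeric columns yourself where the model needs numbers.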
MULTI_FILE inputs
When you need to work with a diverse set of files, use the MULTI_FILE input format. Multi-file supports the following file formats:
- .json
- Text (utf-8 encoded text)
- .csv (which must be utf-8 encoded)
- .xlsx (Excel files)
To work with multi-file inputs, you need to define one or more DataFile instances, each of which is associated with a file. You can use the following convenience functions to create them:
- json_data_file: load a .json file.
- csv_data_file: load a .csv file.
- text_data_file: load a text file. Any file with utf-8 encoded text can be used, such as .mip or .lp files.
import nextmv
# Define a data file for a JSON file.
json_file = nextmv.json_data_file("input.json")
# Define a data file for a CSV file.
csv_file = nextmv.csv_data_file("input.csv")
# Define a data file for a text file.
text_file = nextmv.text_data_file("input.txt")
# Load the multi-file input with the defined data files from a dir named "inputs".
multi_file_input_1 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    data_files=[json_file, csv_file, text_file],
)
print(multi_file_input_1.data)
# Load the multi-file input with the defined data files from a custom dir.
multi_file_input_2 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    path="custom_dir",
    data_files=[json_file, csv_file, text_file],
)
print(multi_file_input_2.data)
The resulting .data property of the Input object will contain a dictionary where the keys are the names of the files, and the values are the data loaded from those files.
When working with binary files, such as Excel files, you must define your own DataFile instance. The most important parameter of this class is the .loader, a Callable (function) that you provide. The function takes a file_path, which establishes the location where the data is read from: the .name defined in the class is passed to this function with the correct directory already joined. The .loader can also receive additional arguments and keyword arguments, which you define in the DataFile class through the .loader_args and .loader_kwargs parameters.
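As a rough sketch of how extra loader arguments fit together, the example below defines a loader with one positional and one keyword parameter and calls it directly. It assumes the loader is invoked with the joined file path first, followed by the unpacked loader_args and loader_kwargs; that call order is inferred from the description above rather than confirmed by it. The delimited_loader function and the input.dat file are hypothetical.

```python
import csv
import tempfile
from pathlib import Path
from typing import Any


def delimited_loader(file_path: str, delimiter: str, *, skip_rows: int = 0) -> Any:
    """Hypothetical loader: parse a delimited text file into a list of rows."""
    with open(file_path, encoding="utf-8", newline="") as f:
        rows = list(csv.reader(f, delimiter=delimiter))
    return rows[skip_rows:]


# Values you would set as loader_args and loader_kwargs on the DataFile.
loader_args = [";"]
loader_kwargs = {"skip_rows": 1}

with tempfile.TemporaryDirectory() as tmp:
    file_path = str(Path(tmp, "input.dat"))
    Path(file_path).write_text("a;b\n1;2\n3;4\n", encoding="utf-8")
    # Assumed call shape: path first, then the extra arguments.
    rows = delimited_loader(file_path, *loader_args, **loader_kwargs)

print(rows)
```

Keeping file-specific options such as the delimiter in loader_args and loader_kwargs lets one loader function serve several DataFile definitions.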
from typing import Any
import nextmv
# Define a custom loader for an Excel file.
def excel_loader(file_path: str) -> Any:
    import pandas as pd
    return pd.read_excel(file_path, sheet_name=None)
# Define a data file for an Excel file.
excel_file = nextmv.DataFile(
    name="input.xlsx",
    loader=excel_loader,
    loader_args=[],  # Optional, you don't need to define this if no args are needed.
    loader_kwargs={},  # Optional, you don't need to define this if no kwargs are needed.
)
# Load the multi-file input with the defined data files from a dir named "inputs".
multi_file_input_3 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    data_files=[excel_file],
)
print(multi_file_input_3.data)
# Load the multi-file input with the defined data files from a custom dir.
multi_file_input_4 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    path="custom_dir",
    data_files=[excel_file],
)
print(multi_file_input_4.data)
As mentioned, the .data property of the Input object will contain a dictionary where the keys are the names of the files, and the values are the data loaded from those files. If you wish to customize the key names, you can use the .input_data_key parameter in the DataFile class. The convenience functions also support this argument, allowing you to specify a custom key name for the data loaded from the file.
Here is an example of how to use the .input_data_key parameter with the convenience functions and the DataFile class:
from typing import Any
import nextmv
# Define a data file for a JSON file with a custom key name.
json_file = nextmv.json_data_file("input.json", input_data_key="custom_json_key")
# Define a data file for a CSV file with a custom key name.
csv_file = nextmv.csv_data_file("input.csv", input_data_key="custom_csv_key")
# Define a data file for a text file with a custom key name.
text_file = nextmv.text_data_file("input.txt", input_data_key="custom_text_key")
# Define a custom loader for an Excel file that reads the first sheet.
def excel_loader(file_path: str) -> Any:
    import pandas as pd
    return pd.read_excel(file_path, sheet_name=0).to_dict()
# Define a data file for an Excel file using the custom loader and a custom key
# name.
excel_file = nextmv.DataFile(
    name="input.xlsx",
    loader=excel_loader,
    loader_args=[],  # Optional, you don't need to define this if no args are needed.
    loader_kwargs={},  # Optional, you don't need to define this if no kwargs are needed.
    input_data_key="custom_excel_key",  # Custom key name for the data loaded from the file.
)
# Load the multi-file input with the defined data files from a dir named "inputs".
multi_file_input_5 = nextmv.load(
    input_format=nextmv.InputFormat.MULTI_FILE,
    data_files=[json_file, csv_file, text_file, excel_file],
)
# View the loaded data.
nextmv.write(multi_file_input_5.data)
$ python main.py
{
  "custom_json_key": {
    "message": "Hello from JSON",
    "numbers": [
      1,
      2,
      3
    ]
  },
  "custom_csv_key": [
    {
      "name": "Alice",
      "age": "25",
      "city": "New York"
    },
    {
      "name": "Bob",
      "age": "30",
      "city": "London"
    },
    {
      "name": "Charlie",
      "age": "35",
      "city": "Tokyo"
    }
  ],
  "custom_text_key": "This is a test text file.\nIt contains multiple lines.\nHello from text!",
  "custom_excel_key": {
    "a": {
      "0": 1
    },
    "b": {
      "0": 2
    }
  }
}
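The output above shows two details worth handling in model code: the CSV values arrive as strings, and the Excel data is nested as column, then row label, then value. The sketch below works on a trimmed copy of that printed JSON, so it runs without any input files; the variable names are illustrative only.

```python
import json

# Trimmed copy of the printed output above (CSV and Excel keys only).
output = json.loads(
    """
    {
      "custom_csv_key": [
        {"name": "Alice", "age": "25", "city": "New York"},
        {"name": "Bob", "age": "30", "city": "London"},
        {"name": "Charlie", "age": "35", "city": "Tokyo"}
      ],
      "custom_excel_key": {"a": {"0": 1}, "b": {"0": 2}}
    }
    """
)

# CSV cells are strings in the output, so convert numeric columns explicitly.
ages = {row["name"]: int(row["age"]) for row in output["custom_csv_key"]}

# The Excel data is column -> row label -> value. In the printed JSON the row
# labels are strings; in memory, pandas' to_dict() uses the integer index.
first_row = {column: cells["0"] for column, cells in output["custom_excel_key"].items()}

print(ages)
print(first_row)
```

If you read multi_file_input_5.data directly instead of the printed JSON, index the Excel rows with the integer 0 rather than the string "0".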