utils.schemas.datasets
Pydantic models for dataset-related configuration.
Classes
| Name | Description |
|---|---|
| DPODataset | DPO configuration subset |
| KTODataset | KTO configuration subset |
| PretrainingDataset | Pretraining dataset configuration subset |
| SFTDataset | SFT configuration subset |
| StepwiseSupervisedDataset | Stepwise supervised dataset configuration subset |
| SyntheticDataset | Synthetic dataset configuration for benchmarking and testing. |
| UserDefinedDPOType | User defined typing for DPO |
| UserDefinedKTOType | User defined typing for KTO |
| UserDefinedPrompterType | Structure for user defined prompt types |
DPODataset
utils.schemas.datasets.DPODataset()
DPO configuration subset
KTODataset
utils.schemas.datasets.KTODataset()
KTO configuration subset
PretrainingDataset
utils.schemas.datasets.PretrainingDataset()
Pretraining dataset configuration subset
SFTDataset
utils.schemas.datasets.SFTDataset()
SFT configuration subset
Methods
| Name | Description |
|---|---|
| handle_legacy_message_fields | Handle backwards compatibility between legacy message field mapping and new property mapping system. |
handle_legacy_message_fields
utils.schemas.datasets.SFTDataset.handle_legacy_message_fields(data)
Handle backwards compatibility between legacy message field mapping and new property mapping system.
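The kind of backwards-compatibility step described above can be sketched as a plain function over the raw config dictionary. This is an illustrative sketch, not the library's implementation: the function name `migrate_legacy_message_fields` is hypothetical, and the key names `message_field_role`, `message_field_content`, and `message_property_mappings` are assumptions about what the legacy and new fields are called. It uses plain dictionaries rather than a Pydantic validator so that it stays self-contained.

```python
from typing import Any


def migrate_legacy_message_fields(data: dict[str, Any]) -> dict[str, Any]:
    """Fold legacy per-field keys into a single property-mapping dict.

    Hypothetical helper: the key names below are assumptions, not a
    confirmed description of the SFTDataset schema.
    """
    data = dict(data)  # copy so the caller's dict is not mutated

    # Pull the (assumed) legacy keys out of the config, if present.
    legacy = {
        "role": data.pop("message_field_role", None),
        "content": data.pop("message_field_content", None),
    }

    # Start from whatever the new-style mapping already contains.
    mappings = dict(data.get("message_property_mappings") or {})

    for prop, legacy_value in legacy.items():
        if legacy_value is None:
            continue  # legacy key was not set; nothing to migrate
        if prop in mappings and mappings[prop] != legacy_value:
            # Both systems were used and they disagree: refuse to guess.
            raise ValueError(
                f"Conflicting values for {prop!r}: legacy field says "
                f"{legacy_value!r}, property mapping says {mappings[prop]!r}"
            )
        mappings[prop] = legacy_value

    if mappings:
        data["message_property_mappings"] = mappings
    return data
```

Raising on a conflict, rather than silently preferring one source, is a design choice of this sketch; the actual validator may resolve conflicts differently.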
StepwiseSupervisedDataset
utils.schemas.datasets.StepwiseSupervisedDataset()
Stepwise supervised dataset configuration subset
SyntheticDataset
utils.schemas.datasets.SyntheticDataset()
Synthetic dataset configuration for benchmarking and testing.
Generates datasets with configurable sequence length, dataset size, and token ID ranges. Useful for benchmarking memory usage and speed by sequence length, and for validating weighted dataset mixes.
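To make the three knobs mentioned above concrete (sequence length, dataset size, token ID range), here is a minimal standard-library sketch of a generator driven by such a configuration. Everything in it is hypothetical: `SyntheticSpec`, its field names and defaults, and `generate_synthetic_rows` are illustrative stand-ins and do not reflect the actual fields of `SyntheticDataset`.

```python
import random
from dataclasses import dataclass


@dataclass
class SyntheticSpec:
    # Hypothetical field names and defaults, chosen for illustration only.
    sequence_length: int = 2048   # tokens per sample
    num_samples: int = 1000       # number of rows to generate
    min_token_id: int = 100       # inclusive lower bound for token IDs
    max_token_id: int = 32000     # exclusive upper bound for token IDs
    seed: int = 42                # fixed seed for reproducible benchmarks


def generate_synthetic_rows(spec: SyntheticSpec) -> list[dict[str, list[int]]]:
    """Build rows of random token IDs shaped like tokenized training samples."""
    rng = random.Random(spec.seed)
    rows = []
    for _ in range(spec.num_samples):
        # randrange excludes the upper bound, matching the comment above.
        ids = [
            rng.randrange(spec.min_token_id, spec.max_token_id)
            for _ in range(spec.sequence_length)
        ]
        rows.append(
            {
                "input_ids": ids,
                "attention_mask": [1] * len(ids),  # no padding in this sketch
                "labels": list(ids),               # train on every token
            }
        )
    return rows
```

Because every row has exactly `sequence_length` tokens, sweeping that one value while holding the others fixed gives a controlled way to compare memory usage and throughput across sequence lengths.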
UserDefinedDPOType
utils.schemas.datasets.UserDefinedDPOType()
User defined typing for DPO
UserDefinedKTOType
utils.schemas.datasets.UserDefinedKTOType()
User defined typing for KTO
UserDefinedPrompterType
utils.schemas.datasets.UserDefinedPrompterType()
Structure for user defined prompt types