
***********************
create_training_dataset
***********************



.. py:method:: CleanRoomsML.Client.create_training_dataset(**kwargs)

  

  Defines the information necessary to create a training dataset. In Clean Rooms ML, the ``TrainingDataset`` is metadata that points to a Glue table. Clean Rooms ML reads that table only when an ``AudienceModel`` is created.

  

  See also: `AWS API Documentation <https://docs.aws.amazon.com/goto/WebAPI/cleanroomsml-2023-09-06/CreateTrainingDataset>`_  


  **Request Syntax**
  ::

    response = client.create_training_dataset(
        name='string',
        roleArn='string',
        trainingData=[
            {
                'type': 'INTERACTIONS',
                'inputConfig': {
                    'schema': [
                        {
                            'columnName': 'string',
                            'columnTypes': [
                                'USER_ID'|'ITEM_ID'|'TIMESTAMP'|'CATEGORICAL_FEATURE'|'NUMERICAL_FEATURE',
                            ]
                        },
                    ],
                    'dataSource': {
                        'glueDataSource': {
                            'tableName': 'string',
                            'databaseName': 'string',
                            'catalogId': 'string'
                        }
                    }
                }
            },
        ],
        tags={
            'string': 'string'
        },
        description='string'
    )
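
  As a minimal sketch of how the request above might be populated, the hypothetical helper below assembles the keyword arguments for a single ``INTERACTIONS`` dataset. The helper name, the column names, and every example value are assumptions for illustration; only the parameter names and enum values come from the request syntax.

  ::

    def build_training_dataset_request(name, role_arn, table_name, database_name,
                                       columns, catalog_id=None, description=None,
                                       tags=None):
        """Assemble keyword arguments for create_training_dataset.

        `columns` maps each column name to a list of column types,
        for example {"user_id": ["USER_ID"]}.
        """
        glue_source = {"tableName": table_name, "databaseName": database_name}
        # catalogId is optional, so it is only included when provided.
        if catalog_id is not None:
            glue_source["catalogId"] = catalog_id

        request = {
            "name": name,
            "roleArn": role_arn,
            "trainingData": [
                {
                    "type": "INTERACTIONS",
                    "inputConfig": {
                        "schema": [
                            {"columnName": col, "columnTypes": list(types)}
                            for col, types in columns.items()
                        ],
                        "dataSource": {"glueDataSource": glue_source},
                    },
                }
            ],
        }
        if description is not None:
            request["description"] = description
        if tags:
            request["tags"] = dict(tags)
        return request


    # Hypothetical example values; replace them with your own resources.
    request = build_training_dataset_request(
        name="example-training-dataset",
        role_arn="arn:aws:iam::111122223333:role/ExampleCleanRoomsMLRole",
        table_name="example_interactions",
        database_name="example_database",
        columns={
            "user_id": ["USER_ID"],
            "item_id": ["ITEM_ID"],
            "event_time": ["TIMESTAMP"],
        },
    )

    # With a configured client, the request could then be sent as:
    #   client = boto3.client("cleanroomsml")
    #   response = client.create_training_dataset(**request)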
    
  :type name: string
  :param name: **[REQUIRED]** 

    The name of the training dataset. This name must be unique in your account and region.

    

  
  :type roleArn: string
  :param roleArn: **[REQUIRED]** 

    The ARN of the IAM role that Clean Rooms ML can assume to read the data referred to in the ``dataSource`` field of each dataset.

     

    Passing a role across AWS accounts is not allowed. If you pass a role that isn't in your account, you get an ``AccessDeniedException`` error.
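
    Because a role from another account is rejected, it can help to compare account IDs before calling the API. The sketch below assumes the standard IAM role ARN layout (``arn:<partition>:iam::<account-id>:role/<name>``). The helper name and the example ARNs are hypothetical; in practice the caller's account ID would typically come from the ``Account`` field of an STS ``get_caller_identity`` response.

    ::

      def role_is_in_account(role_arn, account_id):
          """Return True if the IAM role ARN belongs to the given account."""
          parts = role_arn.split(":", 5)
          # Expected layout: arn:<partition>:iam::<account-id>:role/<name>
          if len(parts) != 6 or parts[0] != "arn" or parts[2] != "iam":
              raise ValueError(f"not an IAM ARN: {role_arn!r}")
          return parts[4] == account_id


      # Hypothetical ARNs and account IDs, used only to show both outcomes.
      same = role_is_in_account(
          "arn:aws:iam::111122223333:role/ExampleRole", "111122223333")
      other = role_is_in_account(
          "arn:aws:iam::444455556666:role/ExampleRole", "111122223333")
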

    

  
  :type trainingData: list
  :param trainingData: **[REQUIRED]** 

    An array of ``Dataset`` objects, each of which specifies the dataset type and details about its location and schema. You must provide a role that has read access to these tables.

    

  
    - *(dict) --* 

      Defines where the training dataset is located, what type of data it contains, and how to access the data.

      

    
      - **type** *(string) --* **[REQUIRED]** 

        The type of information found in the dataset. The request syntax lists ``INTERACTIONS`` as the only value.

        

      
      - **inputConfig** *(dict) --* **[REQUIRED]** 

        A DatasetInputConfig object that defines the data source and schema mapping.

        

      
        - **schema** *(list) --* **[REQUIRED]** 

          The schema information for the training data.

          

        
          - *(dict) --* 

            Metadata for a column.

            

          
            - **columnName** *(string) --* **[REQUIRED]** 

              The name of a column.

              

            
            - **columnTypes** *(list) --* **[REQUIRED]** 

              The data types of the column.

              

            
              - *(string) --* 

              
          
          
      
        - **dataSource** *(dict) --* **[REQUIRED]** 

          A DataSource object that specifies the Glue data source for the training data.

          

        
          - **glueDataSource** *(dict) --* **[REQUIRED]** 

            A GlueDataSource object that defines the catalog ID, database name, and table name for the training data.

            

          
            - **tableName** *(string) --* **[REQUIRED]** 

              The Glue table that contains the training data.

              

            
            - **databaseName** *(string) --* **[REQUIRED]** 

              The Glue database that contains the training data.

              

            
            - **catalogId** *(string) --* 

              The Glue catalog that contains the training data.

              

            
          
        
      
    

  :type tags: dict
  :param tags: 

    The optional metadata that you apply to the resource to help you categorize and organize it. Each tag consists of a key and an optional value, both of which you define.

     

    The following basic restrictions apply to tags:

     

    
    * Maximum number of tags per resource - 50.
     
    * For each resource, each tag key must be unique, and each tag key can have only one value.
     
    * Maximum key length - 128 Unicode characters in UTF-8.
     
    * Maximum value length - 256 Unicode characters in UTF-8.
     
    * If your tagging schema is used across multiple services and resources, remember that other services may have restrictions on allowed characters. Generally allowed characters are: letters, numbers, and spaces representable in UTF-8, and the following characters: + - = . _ : / @.
     
    * Tag keys and values are case sensitive.
     
    * Do not use ``aws:``, ``AWS:``, or any upper or lowercase combination of these as a prefix for keys, because this prefix is reserved for AWS use. You cannot edit or delete tag keys with this prefix. Values can have this prefix. If a tag value has ``aws`` as its prefix but the key does not, then Clean Rooms ML considers it to be a user tag and it counts against the limit of 50 tags. Tags whose key has the ``aws`` prefix do not count against your tags per resource limit.
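
    The restrictions above can be checked client-side before a request is sent. The following is a minimal sketch, not the service's own validation: the function and constant names are hypothetical, lengths are counted with Python's ``len`` (Unicode code points, which is assumed to match the service's character count), and the allowed-character rule is not checked.

    ::

      MAX_TAGS = 50
      MAX_KEY_LENGTH = 128
      MAX_VALUE_LENGTH = 256


      def validate_tags(tags):
          """Return a list of problems found in a tags dict (empty if none)."""
          problems = []
          if len(tags) > MAX_TAGS:
              problems.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
          for key, value in tags.items():
              if len(key) > MAX_KEY_LENGTH:
                  problems.append(f"key too long: {key[:20]!r}...")
              if len(value) > MAX_VALUE_LENGTH:
                  problems.append(f"value too long for key {key!r}")
              # The aws: prefix is reserved in any upper/lowercase combination.
              if key.lower().startswith("aws:"):
                  problems.append(f"reserved prefix in key {key!r}")
          return problems

    An empty list means that none of the checked restrictions were violated; the service may still reject the tags for other reasons.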
    

    

  
    - *(string) --* 

    
      - *(string) --* 

      


  :type description: string
  :param description: 

    The description of the training dataset.

    

  
  
  :rtype: dict
  :returns: 
    
    **Response Syntax**

    
    ::

      {
          'trainingDatasetArn': 'string'
      }
      
    **Response Structure**

    

    - *(dict) --* 
      

      - **trainingDatasetArn** *(string) --* 

        The Amazon Resource Name (ARN) of the training dataset resource.

        
  
  **Exceptions**
  
  *   :py:class:`CleanRoomsML.Client.exceptions.ConflictException`

  
  *   :py:class:`CleanRoomsML.Client.exceptions.ValidationException`

  
  *   :py:class:`CleanRoomsML.Client.exceptions.AccessDeniedException`
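
  A hedged sketch of handling the exceptions listed above through the client's ``exceptions`` attribute. The wrapper name and the choice to print and return ``None`` are assumptions made for illustration; real code might log, retry, or re-raise instead.

  ::

    def create_training_dataset_safely(client, **request):
        """Return the new training dataset ARN, or None on a listed error."""
        try:
            response = client.create_training_dataset(**request)
        except client.exceptions.ConflictException as err:
            print(f"conflict: {err}")
        except client.exceptions.ValidationException as err:
            print(f"invalid request: {err}")
        except client.exceptions.AccessDeniedException as err:
            print(f"access denied: {err}")
        else:
            return response["trainingDatasetArn"]
        return None

  Here ``client`` is expected to be a Clean Rooms ML client, for example one created with ``boto3.client("cleanroomsml")``, and ``request`` holds the parameters described in the request syntax.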

  