
**************
create_dataset
**************



.. py:method:: Personalize.Client.create_dataset(**kwargs)

  

  Creates an empty dataset and adds it to the specified dataset group. Use `CreateDatasetImportJob <https://docs.aws.amazon.com/personalize/latest/dg/API_CreateDatasetImportJob.html>`__ to import your training data into the dataset.

   

  There are five types of datasets:

   

  
  * Item interactions
   
  * Items
   
  * Users
   
  * Action interactions
   
  * Actions
  

   

  Each dataset type has an associated schema with required field types. Only the ``Item interactions`` dataset is required to train a model (also referred to as creating a solution).

   

  A dataset can be in one of the following states:

   

  
  * CREATE PENDING > CREATE IN_PROGRESS > ACTIVE -or- CREATE FAILED
   
  * DELETE PENDING > DELETE IN_PROGRESS
  

   

  To get the status of the dataset, call `DescribeDataset <https://docs.aws.amazon.com/personalize/latest/dg/API_DescribeDataset.html>`__.
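
  The status lifecycle above can be polled until the dataset is usable. The following is a minimal sketch rather than part of the API: ``wait_for_dataset_active`` and its parameters are hypothetical names, and the code assumes that ``describe_dataset`` reports the status under ``response['dataset']['status']``.

  ```python
  import time


  def wait_for_dataset_active(client, dataset_arn, poll_seconds=10.0,
                              timeout_seconds=600.0, sleep=time.sleep):
      """Poll describe_dataset until the dataset reaches ACTIVE.

      Hypothetical helper: ``client`` is expected to be a Personalize client,
      for example ``boto3.client('personalize')``.
      """
      waited = 0.0
      while True:
          response = client.describe_dataset(datasetArn=dataset_arn)
          status = response['dataset']['status']
          if status == 'ACTIVE':
              return status
          if status == 'CREATE FAILED':
              raise RuntimeError(f'Dataset creation failed: {dataset_arn}')
          if waited >= timeout_seconds:
              raise TimeoutError(f'Dataset still {status!r} after {waited:.0f}s')
          # Wait before asking again; ``sleep`` is injectable for testing.
          sleep(poll_seconds)
          waited += poll_seconds
  ```

  The ``sleep`` argument exists only so the loop can be exercised without real delays; the default polling interval and timeout are arbitrary choices.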

   

  **Related APIs**

   

  
  * `CreateDatasetGroup <https://docs.aws.amazon.com/personalize/latest/dg/API_CreateDatasetGroup.html>`__
   
  * `ListDatasets <https://docs.aws.amazon.com/personalize/latest/dg/API_ListDatasets.html>`__
   
  * `DescribeDataset <https://docs.aws.amazon.com/personalize/latest/dg/API_DescribeDataset.html>`__
   
  * `DeleteDataset <https://docs.aws.amazon.com/personalize/latest/dg/API_DeleteDataset.html>`__
  

  

  See also: `AWS API Documentation <https://docs.aws.amazon.com/goto/WebAPI/personalize-2018-05-22/CreateDataset>`_  


  **Request Syntax**
  ::

    response = client.create_dataset(
        name='string',
        schemaArn='string',
        datasetGroupArn='string',
        datasetType='string',
        tags=[
            {
                'tagKey': 'string',
                'tagValue': 'string'
            },
        ]
    )
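
  As an illustration of the request syntax above, the sketch below assembles the keyword arguments and returns the ARN of the new dataset. ``create_personalize_dataset`` and ``VALID_DATASET_TYPES`` are hypothetical names, and the client-side type check is an added convenience, not something the service requires.

  ```python
  VALID_DATASET_TYPES = {
      'interactions', 'items', 'users', 'actions', 'action_interactions',
  }


  def create_personalize_dataset(client, name, schema_arn, dataset_group_arn,
                                 dataset_type, tags=None):
      """Call create_dataset and return the ARN of the new dataset."""
      # datasetType is documented as case insensitive, so compare in lowercase.
      if dataset_type.lower() not in VALID_DATASET_TYPES:
          raise ValueError(f'Unsupported dataset type: {dataset_type!r}')

      kwargs = {
          'name': name,
          'schemaArn': schema_arn,
          'datasetGroupArn': dataset_group_arn,
          'datasetType': dataset_type,
      }
      if tags:
          # Convert a plain {key: value} mapping to the documented tag structure.
          kwargs['tags'] = [
              {'tagKey': key, 'tagValue': value} for key, value in tags.items()
          ]

      response = client.create_dataset(**kwargs)
      return response['datasetArn']
  ```

  With a real client, the first argument would typically be ``boto3.client('personalize')``; ``tags`` is omitted from the request when no tags are given.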
    
  :type name: string
  :param name: **[REQUIRED]** 

    The name for the dataset.

    

  
  :type schemaArn: string
  :param schemaArn: **[REQUIRED]** 

    The ARN of the schema to associate with the dataset. The schema defines the dataset fields.

    

  
  :type datasetGroupArn: string
  :param datasetGroupArn: **[REQUIRED]** 

    The Amazon Resource Name (ARN) of the dataset group to add the dataset to.

    

  
  :type datasetType: string
  :param datasetType: **[REQUIRED]** 

    The type of dataset.

     

    One of the following (case-insensitive) values:

     

    
    * Interactions
     
    * Items
     
    * Users
     
    * Actions
     
    * Action_Interactions
    

    

  
  :type tags: list
  :param tags: 

    A list of `tags <https://docs.aws.amazon.com/personalize/latest/dg/tagging-resources.html>`__ to apply to the dataset.

    

  
    - *(dict) --* 

      The optional metadata that you apply to resources to help you categorize and organize them. Each tag consists of a key and an optional value, both of which you define. For more information, see `Tagging Amazon Personalize resources <https://docs.aws.amazon.com/personalize/latest/dg/tagging-resources.html>`__.

      

    
      - **tagKey** *(string) --* **[REQUIRED]** 

        One part of a key-value pair that makes up a tag. A key is a general label that acts like a category for more specific tag values.

        

      
      - **tagValue** *(string) --* **[REQUIRED]** 

        The optional part of a key-value pair that makes up a tag. A value acts as a descriptor within a tag category (key).

        

      
    

  
  :rtype: dict
  :returns: 
    
    **Response Syntax**

    
    ::

      {
          'datasetArn': 'string'
      }
      
    **Response Structure**

    

    - *(dict) --* 
      

      - **datasetArn** *(string) --* 

        The ARN of the dataset.

        
  
  **Exceptions**
  
  *   :py:class:`Personalize.Client.exceptions.InvalidInputException`

  
  *   :py:class:`Personalize.Client.exceptions.ResourceNotFoundException`

  
  *   :py:class:`Personalize.Client.exceptions.ResourceAlreadyExistsException`

  
  *   :py:class:`Personalize.Client.exceptions.LimitExceededException`

  
  *   :py:class:`Personalize.Client.exceptions.ResourceInUseException`

  
  *   :py:class:`Personalize.Client.exceptions.TooManyTagsException`
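
  The exceptions listed above are exposed on the client object itself, so they can be caught without importing service-specific classes. The sketch below shows one possible pattern; ``try_create_dataset`` is a hypothetical helper, and treating an existing dataset as a non-error is a design assumption, not documented behavior.

  ```python
  def try_create_dataset(client, **create_kwargs):
      """Return the new dataset ARN, or None if the dataset already exists."""
      try:
          response = client.create_dataset(**create_kwargs)
      except client.exceptions.ResourceAlreadyExistsException:
          # Assumption: the caller prefers a None result over an exception here.
          return None
      # Any other listed exception (for example InvalidInputException)
      # propagates to the caller unchanged.
      return response['datasetArn']
  ```

  A caller that receives ``None`` could then look up the existing dataset, for example with ``list_datasets``.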

  