******************************************
start_data_quality_rule_recommendation_run
******************************************



.. py:method:: Glue.Client.start_data_quality_rule_recommendation_run(**kwargs)

  

  Starts a recommendation run, which generates rules when you don't know what rules to write. Glue Data Quality analyzes the data and recommends a potential ruleset. You can then triage the generated ruleset and modify it as needed.

   

  Recommendation runs are automatically deleted after 90 days.

  

  See also: `AWS API Documentation <https://docs.aws.amazon.com/goto/WebAPI/glue-2017-03-31/StartDataQualityRuleRecommendationRun>`_  


  **Request Syntax**
  ::

    response = client.start_data_quality_rule_recommendation_run(
        DataSource={
            'GlueTable': {
                'DatabaseName': 'string',
                'TableName': 'string',
                'CatalogId': 'string',
                'ConnectionName': 'string',
                'AdditionalOptions': {
                    'string': 'string'
                }
            },
            'DataQualityGlueTable': {
                'DatabaseName': 'string',
                'TableName': 'string',
                'CatalogId': 'string',
                'ConnectionName': 'string',
                'AdditionalOptions': {
                    'string': 'string'
                },
                'PreProcessingQuery': 'string'
            }
        },
        Role='string',
        NumberOfWorkers=123,
        Timeout=123,
        CreatedRulesetName='string',
        DataQualitySecurityConfiguration='string',
        ClientToken='string'
    )
    
  :type DataSource: dict
  :param DataSource: **[REQUIRED]** 

    The data source (Glue table) associated with this run.

    

  
    - **GlueTable** *(dict) --* 

      A Glue table.

      

    
      - **DatabaseName** *(string) --* **[REQUIRED]** 

        A database name in the Glue Data Catalog.

        

      
      - **TableName** *(string) --* **[REQUIRED]** 

        A table name in the Glue Data Catalog.

        

      
      - **CatalogId** *(string) --* 

        A unique identifier for the Glue Data Catalog.

        

      
      - **ConnectionName** *(string) --* 

        The name of the connection to the Glue Data Catalog.

        

      
      - **AdditionalOptions** *(dict) --* 

        Additional options for the table. Two keys are currently supported:

         

        
        * ``pushDownPredicate``: to filter on partitions without having to list and read all the files in your dataset.
         
        * ``catalogPartitionPredicate``: to use server-side partition pruning using partition indexes in the Glue Data Catalog.
        

        

      
        - *(string) --* 

        
          - *(string) --* 

          
    
  
    
    - **DataQualityGlueTable** *(dict) --* 

      A Glue table for Data Quality operations.

      

    
      - **DatabaseName** *(string) --* **[REQUIRED]** 

        A database name in the Glue Data Catalog.

        

      
      - **TableName** *(string) --* **[REQUIRED]** 

        A table name in the Glue Data Catalog.

        

      
      - **CatalogId** *(string) --* 

        A unique identifier for the Glue Data Catalog.

        

      
      - **ConnectionName** *(string) --* 

        The name of the connection to the Glue Data Catalog.

        

      
      - **AdditionalOptions** *(dict) --* 

        Additional options for the table. Two keys are currently supported:

         

        
        * ``pushDownPredicate``: to filter on partitions without having to list and read all the files in your dataset.
         
        * ``catalogPartitionPredicate``: to use server-side partition pruning using partition indexes in the Glue Data Catalog.
        

        

      
        - *(string) --* 

        
          - *(string) --* 

          
    
  
      - **PreProcessingQuery** *(string) --* 

        A SQL query in SparkSQL format that can be used to pre-process the data for the table in the Glue Data Catalog before the Data Quality operation runs.
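
    As a minimal sketch of how these keys nest, the helper below builds a ``DataSource`` value that uses the ``DataQualityGlueTable`` shape. The helper name ``build_data_source``, the database and table names, and the predicate string are illustrative assumptions; only the dictionary keys come from the request syntax above::

      def build_data_source(database, table, partition_filter=None, query=None):
          """Build a DataSource dict with a DataQualityGlueTable entry.

          Hypothetical helper for illustration; not part of boto3.
          """
          glue_table = {"DatabaseName": database, "TableName": table}
          if partition_filter is not None:
              # pushDownPredicate filters on partitions without listing
              # and reading every file in the dataset.
              glue_table["AdditionalOptions"] = {
                  "pushDownPredicate": partition_filter,
              }
          if query is not None:
              # Optional SparkSQL query applied before the run.
              glue_table["PreProcessingQuery"] = query
          return {"DataQualityGlueTable": glue_table}


      data_source = build_data_source(
          "sales_db",  # placeholder database name
          "orders",    # placeholder table name
          partition_filter="year == '2024'",  # placeholder predicate
      )

    The optional keys are added only when a value is supplied, so the resulting dict contains just the required names plus whatever was requested.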

        

      
    
  
  :type Role: string
  :param Role: **[REQUIRED]** 

    An IAM role supplied to encrypt the results of the run.

    

  
  :type NumberOfWorkers: integer
  :param NumberOfWorkers: 

    The number of ``G.1X`` workers to be used in the run. The default is 5.

    

  
  :type Timeout: integer
  :param Timeout: 

    The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters ``TIMEOUT`` status. The default is 2,880 minutes (48 hours).
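
    Because the value is expressed in minutes, a small conversion can help avoid unit mistakes. The sketch below assumes a hypothetical helper named ``timeout_minutes``; it is not part of boto3::

      from datetime import timedelta


      def timeout_minutes(limit):
          """Convert a timedelta to whole minutes for the Timeout parameter."""
          return int(limit.total_seconds() // 60)


      # 48 hours corresponds to the documented default of 2,880 minutes.
      default_timeout = timeout_minutes(timedelta(hours=48))
      short_timeout = timeout_minutes(timedelta(minutes=90))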

    

  
  :type CreatedRulesetName: string
  :param CreatedRulesetName: 

    A name for the ruleset.

    

  
  :type DataQualitySecurityConfiguration: string
  :param DataQualitySecurityConfiguration: 

    The name of the security configuration created with the data quality encryption option.

    

  
  :type ClientToken: string
  :param ClientToken: 

    Used for idempotency. Setting it to a random ID (such as a UUID) is recommended to avoid creating or starting multiple instances of the same resource.
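
    One way to produce such a random ID is Python's standard ``uuid`` module. This is a sketch only; the helper name ``new_client_token`` is an assumption made for illustration::

      import uuid


      def new_client_token():
          """Return a random (version 4) UUID string to pass as ClientToken."""
          return str(uuid.uuid4())


      token = new_client_token()

    Generating the token once and keeping it for any retry of the same logical request is the usual way to benefit from idempotency.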

    

  
  
  :rtype: dict
  :returns: 
    
    **Response Syntax**

    
    ::

      {
          'RunId': 'string'
      }
      
    **Response Structure**

    

    - *(dict) --* 
      

      - **RunId** *(string) --* 

        The unique run identifier associated with this run.

        
  
  **Exceptions**
  
  *   :py:class:`Glue.Client.exceptions.InvalidInputException`

  
  *   :py:class:`Glue.Client.exceptions.OperationTimeoutException`

  
  *   :py:class:`Glue.Client.exceptions.InternalServiceException`

  
  *   :py:class:`Glue.Client.exceptions.ConflictException`
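
  The following sketch shows one possible way to start a run, read ``RunId`` from the response, and handle one of the exceptions listed above. The function name, the choice to return ``None`` on a conflict, and all argument values in the commented usage are assumptions made for illustration; the client is passed in so the function carries no dependency of its own::

    def start_recommendation_run(client, database, table, role, token):
        """Start a recommendation run and return its RunId.

        Returns None if the service raises ConflictException
        (an illustrative policy, not a recommendation).
        """
        try:
            response = client.start_data_quality_rule_recommendation_run(
                DataSource={
                    "GlueTable": {
                        "DatabaseName": database,
                        "TableName": table,
                    }
                },
                Role=role,
                ClientToken=token,
            )
        except client.exceptions.ConflictException:
            return None
        return response["RunId"]


    # Example usage (requires boto3 and AWS credentials; values are placeholders):
    #
    #   import boto3
    #   client = boto3.client("glue")
    #   run_id = start_recommendation_run(
    #       client, "sales_db", "orders", "example-role", "example-token"
    #   )

  Other exceptions, such as ``InvalidInputException``, are deliberately left to propagate to the caller in this sketch.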

  