*****************************************
start_data_quality_ruleset_evaluation_run
*****************************************



.. py:method:: Glue.Client.start_data_quality_ruleset_evaluation_run(**kwargs)

  

  After you have a ruleset definition (either recommended or your own), call this operation to evaluate the ruleset against a data source (a Glue table). The evaluation computes results that you can retrieve with the ``GetDataQualityResult`` API.

  

  See also: `AWS API Documentation <https://docs.aws.amazon.com/goto/WebAPI/glue-2017-03-31/StartDataQualityRulesetEvaluationRun>`_  


  **Request Syntax**
  ::

    response = client.start_data_quality_ruleset_evaluation_run(
        DataSource={
            'GlueTable': {
                'DatabaseName': 'string',
                'TableName': 'string',
                'CatalogId': 'string',
                'ConnectionName': 'string',
                'AdditionalOptions': {
                    'string': 'string'
                }
            },
            'DataQualityGlueTable': {
                'DatabaseName': 'string',
                'TableName': 'string',
                'CatalogId': 'string',
                'ConnectionName': 'string',
                'AdditionalOptions': {
                    'string': 'string'
                },
                'PreProcessingQuery': 'string'
            }
        },
        Role='string',
        NumberOfWorkers=123,
        Timeout=123,
        ClientToken='string',
        AdditionalRunOptions={
            'CloudWatchMetricsEnabled': True|False,
            'ResultsS3Prefix': 'string',
            'CompositeRuleEvaluationMethod': 'COLUMN'|'ROW'
        },
        RulesetNames=[
            'string',
        ],
        AdditionalDataSources={
            'string': {
                'GlueTable': {
                    'DatabaseName': 'string',
                    'TableName': 'string',
                    'CatalogId': 'string',
                    'ConnectionName': 'string',
                    'AdditionalOptions': {
                        'string': 'string'
                    }
                },
                'DataQualityGlueTable': {
                    'DatabaseName': 'string',
                    'TableName': 'string',
                    'CatalogId': 'string',
                    'ConnectionName': 'string',
                    'AdditionalOptions': {
                        'string': 'string'
                    },
                    'PreProcessingQuery': 'string'
                }
            }
        }
    )
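
  As a minimal sketch of the required arguments only (``DataSource``, ``Role``, and ``RulesetNames``), the request can be assembled as a plain dict and passed to the client. The helper names, and the database, table, role, and ruleset values, are hypothetical and chosen for illustration::

    def build_evaluation_request(database_name, table_name, role, ruleset_names):
        """Assemble only the required keyword arguments for the run."""
        return {
            "DataSource": {
                "GlueTable": {
                    "DatabaseName": database_name,
                    "TableName": table_name,
                }
            },
            "Role": role,
            "RulesetNames": list(ruleset_names),
        }


    def start_evaluation(client, request):
        """Start the run on the given Glue client and return its RunId."""
        response = client.start_data_quality_ruleset_evaluation_run(**request)
        return response["RunId"]


    # Hypothetical values; replace them with your own resources.
    request = build_evaluation_request(
        "sales_db",
        "orders",
        "arn:aws:iam::123456789012:role/GlueDataQualityRole",
        ["orders_ruleset"],
    )

  With a real client, the call would look like ``start_evaluation(boto3.client('glue'), request)``.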
    
  :type DataSource: dict
  :param DataSource: **[REQUIRED]** 

    The data source (Glue table) associated with this run.

    

  
    - **GlueTable** *(dict) --* 

      A Glue table.

      

    
      - **DatabaseName** *(string) --* **[REQUIRED]** 

        A database name in the Glue Data Catalog.

        

      
      - **TableName** *(string) --* **[REQUIRED]** 

        A table name in the Glue Data Catalog.

        

      
      - **CatalogId** *(string) --* 

        A unique identifier for the Glue Data Catalog.

        

      
      - **ConnectionName** *(string) --* 

        The name of the connection to the Glue Data Catalog.

        

      
      - **AdditionalOptions** *(dict) --* 

        Additional options for the table. Two keys are currently supported:

         

        
        * ``pushDownPredicate``: to filter on partitions without having to list and read all the files in your dataset.
         
        * ``catalogPartitionPredicate``: to use server-side partition pruning using partition indexes in the Glue Data Catalog.
        

        

      
        - *(string) --* 

        
          - *(string) --* 

          
    
  
    
    - **DataQualityGlueTable** *(dict) --* 

      A Glue table for Data Quality operations.

      

    
      - **DatabaseName** *(string) --* **[REQUIRED]** 

        A database name in the Glue Data Catalog.

        

      
      - **TableName** *(string) --* **[REQUIRED]** 

        A table name in the Glue Data Catalog.

        

      
      - **CatalogId** *(string) --* 

        A unique identifier for the Glue Data Catalog.

        

      
      - **ConnectionName** *(string) --* 

        The name of the connection to the Glue Data Catalog.

        

      
      - **AdditionalOptions** *(dict) --* 

        Additional options for the table. Two keys are currently supported:

         

        
        * ``pushDownPredicate``: to filter on partitions without having to list and read all the files in your dataset.
         
        * ``catalogPartitionPredicate``: to use server-side partition pruning using partition indexes in the Glue Data Catalog.
        

        

      
        - *(string) --* 

        
          - *(string) --* 

          
    
  
      - **PreProcessingQuery** *(string) --* 

        A SQL query in SparkSQL format that pre-processes the data for the table in the Glue Data Catalog before the Data Quality operation runs.
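
    The optional keys above can be added only when they are needed. The following is a sketch, not a definitive implementation: the helper name is hypothetical, and the predicate assumes a partition column named ``year``. The pre-processing query is left to the caller because this section does not specify how the query must refer to the table::

      def build_data_quality_glue_table(database_name, table_name,
                                        push_down_predicate=None,
                                        pre_processing_query=None):
          """Build a DataSource dict that uses the DataQualityGlueTable form."""
          table = {"DatabaseName": database_name, "TableName": table_name}
          if push_down_predicate is not None:
              # Filter on partitions without listing and reading every file.
              table["AdditionalOptions"] = {
                  "pushDownPredicate": push_down_predicate,
              }
          if pre_processing_query is not None:
              table["PreProcessingQuery"] = pre_processing_query
          return {"DataQualityGlueTable": table}


      # Assumes the table is partitioned by a column called "year".
      data_source = build_data_quality_glue_table(
          "sales_db", "orders", push_down_predicate="year = '2024'"
      )

    The resulting dict can be passed as the ``DataSource`` argument.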

        

      
    
  
  :type Role: string
  :param Role: **[REQUIRED]** 

    An IAM role supplied to encrypt the results of the run.

    

  
  :type NumberOfWorkers: integer
  :param NumberOfWorkers: 

    The number of ``G.1X`` workers to be used in the run. The default is 5.

    

  
  :type Timeout: integer
  :param Timeout: 

    The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters ``TIMEOUT`` status. The default is 2,880 minutes (48 hours).

    

  
  :type ClientToken: string
  :param ClientToken: 

    Used for idempotency. Set this to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
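
    One way to use the token, sketched under assumptions: generate a UUID once and send the same value on every retry of the same logical request, so that a repeated call is not treated as a new run. The helper name, the retry count, and the choice of which errors to retry are hypothetical::

      import uuid


      def start_with_retry(client, request, max_attempts=3, retry_on=(Exception,)):
          """Start a run, reusing a single ClientToken across all attempts."""
          if max_attempts < 1:
              raise ValueError("max_attempts must be at least 1")
          # Generated once, outside the loop, so every attempt carries the same token.
          token = str(uuid.uuid4())
          last_error = None
          for _ in range(max_attempts):
              try:
                  response = client.start_data_quality_ruleset_evaluation_run(
                      **{**request, "ClientToken": token}
                  )
                  return response["RunId"]
              except retry_on as error:
                  last_error = error
          raise last_error

    The broad default for ``retry_on`` keeps the sketch short; in practice you would likely narrow it to transient errors only.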

    

  
  :type AdditionalRunOptions: dict
  :param AdditionalRunOptions: 

    Additional run options you can specify for an evaluation run.

    

  
    - **CloudWatchMetricsEnabled** *(boolean) --* 

      Whether or not to enable CloudWatch metrics.

      

    
    - **ResultsS3Prefix** *(string) --* 

      The Amazon S3 prefix under which to store the results.

      

    
    - **CompositeRuleEvaluationMethod** *(string) --* 

      The evaluation method for composite rules in the ruleset: ``ROW`` or ``COLUMN``.
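
    A small sketch of how these options might be assembled, rejecting anything other than ``COLUMN`` or ``ROW`` before the request is sent. The helper name and the bucket in the example are hypothetical, and the ``s3://`` form of the prefix is an assumption::

      VALID_COMPOSITE_METHODS = {"COLUMN", "ROW"}


      def build_additional_run_options(cloudwatch_metrics_enabled=None,
                                       results_s3_prefix=None,
                                       composite_rule_evaluation_method=None):
          """Return an AdditionalRunOptions dict containing only the keys that were set."""
          options = {}
          if cloudwatch_metrics_enabled is not None:
              options["CloudWatchMetricsEnabled"] = bool(cloudwatch_metrics_enabled)
          if results_s3_prefix is not None:
              options["ResultsS3Prefix"] = results_s3_prefix
          if composite_rule_evaluation_method is not None:
              if composite_rule_evaluation_method not in VALID_COMPOSITE_METHODS:
                  raise ValueError(
                      "CompositeRuleEvaluationMethod must be 'COLUMN' or 'ROW'"
                  )
              options["CompositeRuleEvaluationMethod"] = composite_rule_evaluation_method
          return options


      # Hypothetical bucket and prefix.
      run_options = build_additional_run_options(
          cloudwatch_metrics_enabled=True,
          results_s3_prefix="s3://example-bucket/dq-results/",
          composite_rule_evaluation_method="ROW",
      )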

      

    
  
  :type RulesetNames: list
  :param RulesetNames: **[REQUIRED]** 

    A list of ruleset names.

    

  
    - *(string) --* 

    

  :type AdditionalDataSources: dict
  :param AdditionalDataSources: 

    A map of reference strings to additional data sources you can specify for an evaluation run.

    

  
    - *(string) --* 

    
      - *(dict) --* 

        A data source (a Glue table) for which you want data quality results.

        

      
        - **GlueTable** *(dict) --* 

          A Glue table.

          

        
          - **DatabaseName** *(string) --* **[REQUIRED]** 

            A database name in the Glue Data Catalog.

            

          
          - **TableName** *(string) --* **[REQUIRED]** 

            A table name in the Glue Data Catalog.

            

          
          - **CatalogId** *(string) --* 

            A unique identifier for the Glue Data Catalog.

            

          
          - **ConnectionName** *(string) --* 

            The name of the connection to the Glue Data Catalog.

            

          
          - **AdditionalOptions** *(dict) --* 

            Additional options for the table. Two keys are currently supported:

             

            
            * ``pushDownPredicate``: to filter on partitions without having to list and read all the files in your dataset.
             
            * ``catalogPartitionPredicate``: to use server-side partition pruning using partition indexes in the Glue Data Catalog.
            

            

          
            - *(string) --* 

            
              - *(string) --* 

              
        
      
        
        - **DataQualityGlueTable** *(dict) --* 

          A Glue table for Data Quality operations.

          

        
          - **DatabaseName** *(string) --* **[REQUIRED]** 

            A database name in the Glue Data Catalog.

            

          
          - **TableName** *(string) --* **[REQUIRED]** 

            A table name in the Glue Data Catalog.

            

          
          - **CatalogId** *(string) --* 

            A unique identifier for the Glue Data Catalog.

            

          
          - **ConnectionName** *(string) --* 

            The name of the connection to the Glue Data Catalog.

            

          
          - **AdditionalOptions** *(dict) --* 

            Additional options for the table. Two keys are currently supported:

             

            
            * ``pushDownPredicate``: to filter on partitions without having to list and read all the files in your dataset.
             
            * ``catalogPartitionPredicate``: to use server-side partition pruning using partition indexes in the Glue Data Catalog.
            

            

          
            - *(string) --* 

            
              - *(string) --* 

              
        
      
          - **PreProcessingQuery** *(string) --* 

            A SQL query in SparkSQL format that pre-processes the data for the table in the Glue Data Catalog before the Data Quality operation runs.
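
    Because this argument is a map, each key is the reference string and each value has the same shape as ``DataSource``. The sketch below builds such a map from ``(database, table)`` pairs; the helper name, the ``reference`` alias, and the table names are hypothetical::

      def build_additional_data_sources(sources):
          """Map each reference string to a GlueTable data source.

          ``sources`` maps a reference string to a (database_name, table_name) pair.
          """
          return {
              alias: {
                  "GlueTable": {
                      "DatabaseName": database_name,
                      "TableName": table_name,
                  }
              }
              for alias, (database_name, table_name) in sources.items()
          }


      additional_data_sources = build_additional_data_sources(
          {"reference": ("sales_db", "customers")}
      )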

            

          
        
      


  
  :rtype: dict
  :returns: 
    
    **Response Syntax**

    
    ::

      {
          'RunId': 'string'
      }
      
    **Response Structure**

    

    - *(dict) --* 
      

      - **RunId** *(string) --* 

        The unique run identifier associated with this run.
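
    The ``RunId`` can then be used to follow the run until it finishes. The sketch below relies on two related operations that are not described in this section, ``get_data_quality_ruleset_evaluation_run`` and ``get_data_quality_result``, and assumes that the former reports a ``Status`` and a list of ``ResultIds``. The helper name and polling values are hypothetical::

      import time

      # Assumed terminal statuses for an evaluation run.
      TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


      def wait_for_results(client, run_id, poll_seconds=30, max_polls=120,
                           sleep=time.sleep):
          """Poll the run until it reaches a terminal status, then fetch its results."""
          for _ in range(max_polls):
              run = client.get_data_quality_ruleset_evaluation_run(RunId=run_id)
              status = run["Status"]
              if status in TERMINAL_STATUSES:
                  if status != "SUCCEEDED":
                      raise RuntimeError(f"Run {run_id} ended with status {status}")
                  return [
                      client.get_data_quality_result(ResultId=result_id)
                      for result_id in run.get("ResultIds", [])
                  ]
              sleep(poll_seconds)
          raise TimeoutError(f"Run {run_id} did not finish after {max_polls} polls")

    The ``sleep`` parameter exists only so that the wait can be replaced in tests.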

        
  
  **Exceptions**
  
  *   :py:class:`Glue.Client.exceptions.InvalidInputException`

  
  *   :py:class:`Glue.Client.exceptions.EntityNotFoundException`

  
  *   :py:class:`Glue.Client.exceptions.OperationTimeoutException`

  
  *   :py:class:`Glue.Client.exceptions.InternalServiceException`

  
  *   :py:class:`Glue.Client.exceptions.ConflictException`
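
  These exception classes are reached through ``client.exceptions``. The following sketch shows one way to tell a few of them apart; the helper name and the returned labels are hypothetical, and any exception that is not handled here propagates to the caller::

    def start_or_explain(client, request):
        """Start a run and return (run_id, outcome), where outcome is a short label."""
        exceptions = client.exceptions
        try:
            response = client.start_data_quality_ruleset_evaluation_run(**request)
        except exceptions.EntityNotFoundException:
            return None, "not_found"
        except exceptions.InvalidInputException:
            return None, "invalid_input"
        except exceptions.ConflictException:
            return None, "conflict"
        return response["RunId"], "started"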

  