Troubleshooting Data Collection and Analysis

This topic lists common data collection and analysis errors that you may encounter when working with your AWS environments.

Troubleshooting Data Collection

It is important to understand which services are analyzed when performing a CloudWatch audit. The following services are currently analyzed by Densify:

  • EC2, RDS
  • ASGs with a Launch Configuration specified

Additionally, Densify analyzes and provides recommendations for instances that:

  • Are running or in a stopped state at the time of the audit. Recommendations are not generated for terminated instances.
  • Meet the minimum number of required hours and the number of required days, as set in the policy.
  • Have non-transient workloads. Densify analyzes the historical workload for an instance and builds a workload profile for that instance. Transient workloads, by definition, do not have history associated with them and therefore may be excluded.
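
The criteria above can be sketched as a simple filter. This is only an illustration, not Densify's implementation: the Instance fields, the is_candidate function and the threshold values are hypothetical, and the real thresholds come from your policy.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    state: str            # e.g. "running", "stopped", "terminated"
    hours_collected: int  # hours of workload data available
    days_collected: int   # days of workload data available


def is_candidate(inst: Instance, min_hours: int, min_days: int) -> bool:
    """Return True if the instance would qualify for a recommendation."""
    # Terminated instances never receive recommendations.
    if inst.state not in ("running", "stopped"):
        return False
    # Transient workloads lack history, so they fail these thresholds.
    return inst.hours_collected >= min_hours and inst.days_collected >= min_days


# Example with made-up thresholds of 100 hours and 7 days.
fleet = [
    Instance("web-1", "running", 200, 10),
    Instance("batch-7", "terminated", 200, 10),
    Instance("spot-3", "stopped", 5, 1),
]
eligible = [i.name for i in fleet if is_candidate(i, min_hours=100, min_days=7)]
```

Running a check like this against an instance inventory can help explain why a particular instance has no recommendation.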

Table: AWS Connection and Data Collection Errors

Error

Description

Failed to connect to your AWS Public Cloud environment

Cause:

Densify is not able to collect data from this account due to incorrect IAM credentials.

Resolution:

Cause:

The AWS account to which Densify is connected has been deleted.

Resolution:

Within Densify you need to manually delete the Cloud Environments associated with the deleted account and create connections to the new accounts that need to be included in the analysis.

Cause:

All of the systems within the AWS account to which Densify is connected have been deleted.

Resolution:

Densify currently analyzes EC2, RDS, ASGs and ECS clusters that are part of ASGs. During the audit, if Densify discovers that an existing Cloud Environment (which previously had audit data) now has zero systems (i.e. the systems have been deleted), then the audit will fail.

Within Densify you need to manually delete the Cloud Environments associated with the accounts that have no systems.

Cloud Environments for stale ASGs (i.e. deleted ASGs) are deleted automatically.

Cause:

None of the systems within the AWS account are currently supported for analysis by Densify.

Resolution:

If your account contains only services that Densify does not analyze (for example, DynamoDB, Redshift, Lambda or ElastiCache), then the audit will fail.

Within Densify you need to manually delete the Cloud Environments associated with these accounts.
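
Before connecting an account, it can help to check whether it holds anything Densify would analyze. The following is a minimal sketch under stated assumptions: the service labels, the ANALYZED set and the analyzable_count function are hypothetical names chosen for illustration, and the per-service counts are assumed to come from your own inventory tooling.

```python
# Service labels are assumptions; they mirror the services this topic
# says Densify analyzes (EC2, RDS and ASGs).
ANALYZED = {"EC2", "RDS", "ASG"}


def analyzable_count(system_counts: dict) -> int:
    """Sum the systems that belong to an analyzed service."""
    return sum(n for service, n in system_counts.items() if service in ANALYZED)


# An account holding only unsupported services yields zero analyzable systems,
# which corresponds to the audit failure described above.
unsupported_only = {"DynamoDB": 3, "Lambda": 12}
mixed = {"EC2": 2, "Lambda": 5, "RDS": 1}

print(analyzable_count(unsupported_only))
print(analyzable_count(mixed))
```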

Table: GCP Data Collection Errors

Error/Cause

Resolution

PERMISSION_DENIED: This API method requires billing to be enabled.

Cause:

The billing method for the audited project(s) has expired or has not been configured correctly.

For example, the credit card used to set up the project has expired.

Resolution:

Refer to the GCP documentation for details of configuring a billing method for your projects:

https://support.google.com/googleapi/answer/6158867

Table: Cloud Cost Intelligence Reports are Blank

Error

Description

Failed to load billing data correctly

Cause:

In Densify v12.1.8, a number of tables were added to improve the performance of the billing reports.

Resolution:

Troubleshooting Densify Analysis

Once data collection is completed, Densify analyzes the CloudWatch data and generates recommendations. For every Cloud Environment, this includes reading the audit data (i.e. workload metrics), identifying the percentiles (sustained, min, max and peak), excluding any outliers, building a predicted workload profile and then applying the policy settings to generate recommendations.
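
To make the percentile and outlier steps concrete, here is a small sketch using only the Python standard library. It is not Densify's algorithm: the IQR outlier rule, the choice of the 95th percentile as the "sustained" value and the summarize function name are all assumptions made for illustration.

```python
import statistics


def summarize(samples, sustained_pct=95):
    """Drop outliers, then report min, sustained percentile and max."""
    # Assumed outlier rule: discard values beyond 1.5 * IQR from the quartiles.
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [s for s in samples if low <= s <= high]

    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = statistics.quantiles(kept, n=100, method="inclusive")
    return {
        "min": min(kept),
        "sustained": cuts[sustained_pct - 1],
        "max": max(kept),
        "dropped": len(samples) - len(kept),
    }


# Hourly CPU utilization (%) with one obvious spike.
cpu = [10, 12, 11, 13, 12, 11, 10, 12, 95]
profile = summarize(cpu)
```

In this example the single spike is excluded before the summary values are computed, so it does not inflate the resulting profile.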

One policy is associated with each Cloud Environment, and the settings in that policy are used for the analysis. You can change the policy or its settings, then refresh the analysis and re-populate the RDB to see the impact of your changes.

Each system being analyzed must be globally unique across all your Cloud Environments. In other words, each system must appear in only one Cloud Environment. Duplicate systems in different Cloud Environments negatively impact your results.
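
A quick way to check for this condition is to look for system IDs that occur in more than one environment. The sketch below assumes you can export the membership of each Cloud Environment as a list of system IDs; the find_duplicates function and the sample data are hypothetical.

```python
from collections import defaultdict


def find_duplicates(environments):
    """Map each duplicated system ID to the environments that contain it.

    environments: dict of environment name -> iterable of system IDs.
    """
    seen = defaultdict(set)
    for env_name, systems in environments.items():
        for system_id in systems:
            seen[system_id].add(env_name)
    # Keep only the systems that occur in more than one environment.
    return {sid: sorted(envs) for sid, envs in seen.items() if len(envs) > 1}


envs = {
    "prod": ["i-1", "i-2"],
    "dev": ["i-2", "i-3"],
}
duplicates = find_duplicates(envs)
```

Any system reported here is a candidate for multiple recommendations and indicates that the environment filters overlap.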

This table lists possible issues you may encounter during analysis refresh and possible resolutions.

Note: The procedures in the following sections are mainly performed through the Densify Analysis Console. Contact [email protected] for details on using the Analysis Console.

Table: AWS Analysis Errors

Error/Cause

Resolution

Policy changes are not displayed

Cause:

The analysis refresh is run as a scheduled job, usually overnight. The policy changes may have been made after the analysis was last refreshed.

Resolution:

Check when the Cloud Environment was last refreshed. The policy changes must be made prior to the refresh. You may need to refresh (re-run the analytics) on the selected Cloud Environment. See Managing Cloud Environments.
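
The timing check amounts to comparing two timestamps. The sketch below is illustrative only: refresh_needed is a hypothetical helper, and both timestamps are assumed to be values you have looked up yourself.

```python
from datetime import datetime


def refresh_needed(policy_changed_at: datetime, last_refreshed_at: datetime) -> bool:
    """Return True if the policy changed after the most recent refresh."""
    return policy_changed_at > last_refreshed_at


# Policy edited in the morning, last refresh ran the night before:
# the change is not yet reflected in the recommendations.
changed = datetime(2024, 3, 5, 9, 30)
refreshed = datetime(2024, 3, 5, 1, 0)
stale = refresh_needed(changed, refreshed)
```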

No Cloud Environment exists for the selected AWS account ID.

Cause:

If there are no valid systems (no EC2, RDS or ASGs) in the account, then a Cloud Environment will not be created.

Resolution:

Contact [email protected] to review the audit logs to see what systems were discovered.

The Cloud Environment has not been refreshed.

Cause:

The Cloud Environment has not been scheduled to refresh daily.

Resolution:

This scheduled task should be created automatically when you create the cloud connection.

If the refresh task has not been created, then contact [email protected] and provide the details and request that they configure daily refreshes for the Cloud Environment.

Cause:

If no new data was collected for the Cloud Environment during data collection, then the refresh will not run. See the table "AWS Connection and Data Collection Errors", above, for possible causes.

Resolution:

Contact [email protected] and provide the details and request that they review the logs to see what systems, if any, were discovered.

You find multiple recommendations for the same system.

Cause:

A system is duplicated across multiple Cloud Environments.

Resolution:

As indicated above, each system must appear only once across all your Cloud Environments.

Contact [email protected] to ensure that the Cloud Environment Filters are exclusive.

Review your filter settings to ensure that the Cloud Environment Filters are configured correctly. See DCE Views.

Recommendations are not displayed despite systems and cloud environments being valid.

Cause:

Missing data.

Resolution:

Contact [email protected] to verify your policy settings to ensure each system has the required amount of data.

Cause:

Missing workload metrics.

Resolution:

Contact [email protected] to verify that all the necessary metrics (for example, CPU, memory, disk and network I/O) are being collected. Also verify that configuration and benchmark data are included.

Recommendations from the API do not match those displayed in the Densify Console.

Cause:

Once the analysis is run, the recommendations are available via API immediately. However, recommendations do not appear in the UI until the reporting database (RDB) has been updated. The API and UI gather data from two different tables.

Resolution:

Contact [email protected] to find out when the reporting database (RDB) is normally updated.

Recommendations do not align with the defined policy settings.

Cause:

You associate a single policy with each Cloud Environment. In some cases the policy settings are not the best fit for all systems and exceptions need to be defined.

Resolution:

This is a complex problem and there could be several reasons for your recommendations not aligning with your settings. Consider the following:

  • Are there systems where the recommendations do align with policy settings? How are these systems different from the systems that do not align?
  • When was the system last audited and was data collection complete for the mis-aligned system?
  • When were the analytics last refreshed for the mis-aligned system?
  • Are there any attribute overrides already applied to this system?

Contact [email protected] to review the workload and inspect the analysis results, noting any anomalies. You may need to configure policy overrides via attributes.

The Impact Analysis Report is not available but the recommendation is accessible via the API.

Cause:

The Impact Analysis Report is generated by the reporting database (RDB) update. Recommendations can be visible via the API before the report is ready.

Resolution:

Contact [email protected] to find out when the reporting database (RDB) is normally updated.

You can update the RDB by running the "populate" command from the CLI. You can review the scheduler to see when the RDB populate task is scheduled to run. See Scheduling Tasks.