When data practitioners talk about their data environments, they usually emphasize scale and volume. One of Acceldata’s customers, for example, manages dozens of applications and applies a variety of open-source tools to help manage 20 billion+ petabytes of data in a multi-layer stack that relies on both on-prem and cloud platforms. Another scaled their data infrastructure from 70 to more than 1500 nodes, all while achieving 99.97% availability across their Hadoop infrastructure.
It’s impressive stuff, but there is an unheralded story behind all of this, and that’s the uptime, availability, and performance of the platforms that lie at the foundation of these data environments. For companies to operate effectively, the systems that fuel their operations have to be functional and available when needed, which is pretty much all the time. When systems are performing – and when they prove that they are persistently capable of maintaining high levels of performance – data teams operate with confidence.
At Acceldata we’ve made a commitment to be transparent with our customers about the status, availability, and performance of our Acceldata Data Observability Cloud. It’s an essential part of how we serve customers and an important way to earn their trust. Our uptime isn’t just a vanity metric, however. With an always-on dashboard of availability, data teams can anticipate where a potential problem might exist and adapt their own SLAs and expectations accordingly.
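To make an availability percentage concrete when adapting an SLA, it can help to translate it into a downtime budget. The sketch below is plain arithmetic rather than an Acceldata tool; the function name is illustrative, and it assumes a 365-day year unless another period is passed in.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the minutes of downtime permitted over the period at a given availability."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The unavailable fraction of the period is (100 - availability) / 100.
    return total_minutes * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    for pct in (99.9, 99.97, 99.99):
        yearly = downtime_budget_minutes(pct)
        monthly = downtime_budget_minutes(pct, period_days=30)
        print(f"{pct}% -> {yearly:.1f} min/year, {monthly:.1f} min per 30 days")
```

At the 99.97% availability figure cited earlier, this works out to roughly 158 minutes of downtime per year, or about 13 minutes in a 30-day window.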
Beyond the trust factor, this type of reporting facilitates efficient processes and saves time for internal teams who manage an enterprise’s data, applications, and platforms. When a service is unavailable, they have to be able to quickly determine the source. If it’s an internal issue, they must initiate a process to identify root causes and put remediation plans in place. Awareness is critical to all of this because it enables teams to perform rapid remediation.
This type of transparency has become expected by customers. Like Acceldata, data leaders such as Reddit, Twilio, Squarespace, and others provide these insights through their own dashboards. Like us, these and many other data companies use the Statuspage service (offered by Atlassian) to capture and display this information; it is one of a handful of services that offer these dashboards. Other tools, like Downdetector, provide a platform for crowd-sourced performance insights and user feedback.
Each organization provides its own level of detail for its cloud services. GitHub, for example, provides insights into performance and throttling, and bases those insights on user location. Microsoft 365, with its very large user base and a complex set of variables that affect its performance, provides details in a more macro fashion.
Improving Platform Usability and Experience
The Acceldata Data Observability Cloud (ADOC) gathers a huge array of metrics by reading and processing raw data as well as meta information from underlying data sources. It allows data engineers and data scientists to monitor compute performance and validate data quality policies defined within the system. It’s only when these systems are all operating in concert that we can deliver a fully available and usable platform to our users.
You can see the range of sources and their interplay in this diagram of the ADOC architecture:
Think of it this way: the Acceldata environment is the cloud-based ADOC platform, which was developed on Amazon Web Services (AWS) and uses the AWS cloud as its foundation. As the diagram shows, the platform was built with several different AWS services. Acceldata is in charge of all management tasks for the AWS cloud that hosts the platform.
The customer environment consists of the customer's data and infrastructure, which is monitored by the ADOC platform. The environment varies for each Acceldata customer. Therefore, the ADOC platform is customized to meet the specific needs of each customer's infrastructure and data. This ensures that the platform can effectively monitor and manage the customer environment, providing valuable insights and analytics to improve performance and efficiency. Overall, the ADOC platform is an essential tool for Acceldata customers to optimize their data operations and achieve their business objectives.
Transparency for ADOC Performance and Security
Rather than reporting only an overall status on whether our platform is functioning correctly, we provide a comprehensive view into the various systems that support ADOC. Our users can track the specific components of our data observability capabilities to determine when, and if, any of them are experiencing issues. They can also drill into what those issues are or were.
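For readers who want to consume component-level status programmatically, here is a minimal sketch. It assumes a payload shaped like the public components feed that Statuspage-hosted pages typically expose (a `components` list whose entries carry `name` and `status`). The sample payload and component names are invented for illustration and are not ADOC's actual components.

```python
import json
from typing import Dict, List

# Assumption: "operational" is the healthy value in the status vocabulary.
HEALTHY = "operational"


def group_by_status(payload: dict) -> Dict[str, List[str]]:
    """Map each status value to the sorted names of components reporting it."""
    grouped: Dict[str, List[str]] = {}
    for component in payload.get("components", []):
        status = component.get("status", "unknown")
        grouped.setdefault(status, []).append(component.get("name", "<unnamed>"))
    return {status: sorted(names) for status, names in grouped.items()}


def unhealthy_components(payload: dict) -> List[str]:
    """Return the names of components whose status is anything but operational."""
    return sorted(
        component.get("name", "<unnamed>")
        for component in payload.get("components", [])
        if component.get("status") != HEALTHY
    )


# Hypothetical sample payload; the component names are placeholders.
SAMPLE = json.loads("""
{
  "components": [
    {"name": "Web UI", "status": "operational"},
    {"name": "Ingestion API", "status": "degraded_performance"},
    {"name": "Alerting", "status": "operational"}
  ]
}
""")

if __name__ == "__main__":
    print(group_by_status(SAMPLE))
    print(unhealthy_components(SAMPLE))
```

Keeping the parsing separate from any network call makes the logic easy to test against a saved payload before pointing it at a live status page.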
Acceldata adheres to SOC 2 compliance standards (Acceldata has achieved SOC 2 certification) and follows industry best practices across the board to ensure ADOC’s security and the client's data privacy. Some of the components of our security program and system architecture include:
- Acceldata will solely gather metadata, query results, logs, and metrics to diagnose data reliability and compute issues.
- Your data will be used solely to build your own reports and will not be shared with any external parties.
- Raw customer data is never saved in the Acceldata-hosted control plane.
- Processing is carried out on secure servers provided by Amazon Web Services. All storage systems are encrypted, and all servers are strictly monitored and audited. Data is encrypted in transit at all times.
- In cases where debugging or maintenance work is required, a minimal number of engineers will be permitted to access the data necessary for this purpose. All engineers use encrypted laptops and are required to remove data from their devices when their debugging session is complete. No customer data is copied to engineer laptops. Laptop security policies are enforced using MDM.
- Acceldata will access your environment via a set of published static IP addresses, allowing you to secure network-level access to your data resources.
- A yearly penetration test is performed to assess Acceldata's security posture and find vulnerabilities. The most recent test was performed in September 2021, and the report is available upon request.
- Acceldata's service is built on highly available and redundant cloud services, primarily on Amazon Web Services in the US East-2 region.
- Strong passwords and multi-factor authentication are used to protect access to all essential systems and production environments. SSO is used for centralized access control whenever possible. Access is reviewed prior to being granted and then periodically thereafter.
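The static IP item in the list above lends itself to a network-level allowlist on the customer side. The following is a minimal sketch of such a check using Python's standard `ipaddress` module. The addresses are placeholders drawn from documentation-only ranges, not Acceldata's published IPs, and the names `PUBLISHED_SOURCE_IPS` and `is_allowed` are hypothetical.

```python
import ipaddress
from typing import Iterable

# Hypothetical allowlist built from documentation-only ranges (RFC 5737).
# Replace with the addresses actually published by the vendor.
PUBLISHED_SOURCE_IPS = ["203.0.113.10/32", "203.0.113.11/32", "198.51.100.0/28"]


def is_allowed(source_ip: str, allowlist: Iterable[str] = PUBLISHED_SOURCE_IPS) -> bool:
    """Return True if source_ip falls inside any network on the allowlist."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(entry) for entry in allowlist)


if __name__ == "__main__":
    for candidate in ("203.0.113.10", "198.51.100.15", "192.0.2.1"):
        verdict = "allow" if is_allowed(candidate) else "deny"
        print(f"{candidate}: {verdict}")
```

In practice this rule would usually live in a firewall or security group rather than application code; the sketch only illustrates the matching logic.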
In the course of our cloud platform operations, Acceldata also collects some of your information for data processing and other operations.
You can see the specifics of data collected by our data reliability capability, and the Snowflake and Databricks data we collect at the compute level in our documentation.
Availability and Performance at Scale
Consider what happens when all of these systems function as they’re supposed to. First, the data being collected, transmitted, and transacted upon is validated against our security measures. That keeps our customers, their environments, and their data safe. Second, it allows them to operate with true operational integrity; in other words, they know when their own data is being observed and are aware of any potential issues that might affect that effort. For our customers, many of whom operate with incredibly high data volumes, this is business-critical. They have to know the status of their data operations so they can ensure their business is functioning as intended.
See ADOC in action with a quick demo. We'd love to know more about how we can help you solve your data operational, cost, and performance needs.
Photo by Etienne Girardet on Unsplash