Pathe - A Scalable Data Platform Built on Amazon Web Services - GoDataDriven

  • Added on 8 September 2024
  • Although we did not yet have a formal plan in place, we knew that we had access to a lot of data, that we were sitting on a pot of gold, and that we needed to start using it.
    That realization formed the start of a project in which Pathé started applying smart analytics and insights to optimize its business processes as well as its marketing activities.
    Pathé asked Xebia to develop a highly secure central data platform on Amazon Web Services.
    On top of this platform, Xebia developed recommenders and visitor analyses, empowering Pathé to leverage data as it strives to remain the best-loved cinema company in the Netherlands.
    Around four years ago, I started my journey at Pathé, with the objective of professionalizing its business intelligence. We quickly found that a data-driven way of working had a positive impact on our management operations. We also learned that we needed to develop new capabilities to apply data at a more operational level. For example, our marketing department wanted access to information about the effect of a specific campaign.
    Pathé asked Xebia to implement a central data platform and to develop smart applications.
    One of these applications is a feature to display personal recommendations directly from the Pathé app.
    We noticed that our response rates increased dramatically when we recommended movies to our visitors based on data rather than on gut feeling.
    The platform has been developed using the managed services of AWS. The most important component here is S3. For Pathé, we implemented several S3 stations throughout the data workflow. The first station was set up as a landing zone for all raw data to make sure that personally identifiable data is stored securely.
    Xebia also developed applications in different areas for Pathé, including a prediction of the expected number of visitors per location. This prediction is used to optimize the purchasing process.
    The objective was to develop, within a six-week period, features to predict the number of visitors per day and per hour, per location. The model leverages various data sources, including some specific data sets such as the Ajax match schedule and weather forecast data.
    We selected AWS SageMaker for the notebooks and for the training and hosting of models.
    ECS is used to host Airflow, which in turn runs the ETL pipelines. In addition, we use RDS to store data.
    Our organization initially was somewhat reluctant to store data in the cloud. That is why Xebia made sure that security was a key element in each step of the way.
    For a cloud engineer, the architecture of the Pathé data platform is easy to understand, as it makes use of managed services and straightforward connections. What makes it interesting is the way that these connections have been developed.
    The team had a direct relationship with the end-users. Binx Cloud Control has been implemented for support.
    As Pathé is still in the process of building up its cloud expertise, Cloud Control is a valuable add-on. As a partner, Binx can provide services and support the Pathé organization, making sure that Pathé utilizes the platform safely and securely.
    We have only just begun, and I am excited about what lies ahead on our data-driven journey!
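
To make the visitor forecast described above more concrete (features per day, per hour, and per location, enriched with the Ajax schedule and weather forecast data), here is a minimal sketch of how one such feature row might be assembled. It is not the model Xebia built: the `FeatureRow` fields, the `build_feature_row` helper, the rainfall measure, and the location name in the usage note are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class FeatureRow:
    """One hypothetical model input: a single location in a single hour."""
    location: str
    day_of_week: int      # 0 = Monday ... 6 = Sunday
    hour: int             # 0-23
    ajax_match_day: bool  # assumed flag derived from the Ajax schedule
    rain_mm: float        # assumed field taken from a weather forecast


def build_feature_row(
    location: str,
    slot: datetime,
    ajax_match_dates: set[date],
    rain_forecast_mm: dict[datetime, float],
) -> FeatureRow:
    """Combine one location/hour slot with the match schedule and the forecast."""
    return FeatureRow(
        location=location,
        day_of_week=slot.weekday(),
        hour=slot.hour,
        # True when the slot falls on a date listed in the match schedule.
        ajax_match_day=slot.date() in ajax_match_dates,
        # Fall back to 0.0 mm when no forecast exists for this hour.
        rain_mm=rain_forecast_mm.get(slot, 0.0),
    )
```

Calling `build_feature_row("Amsterdam", datetime(2024, 9, 8, 20), {date(2024, 9, 8)}, {})` would give a row for a Sunday evening that is flagged as a match day and has no forecast rain. Rows like this, one per location and hour, are the kind of input a model trained and hosted on SageMaker could consume.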
