
An intelligent and dynamic planner for adaptive migration of big data systems (1 PhD position available)

a. Project Description

Migration of data between two different systems can become a rather complex process. It usually involves unloading the source data, passing it through a middleware that may process the data (compress it, convert it to the target data format, encrypt it, etc.), and eventually loading it into the target system. In practice, the challenge does not lie only in transferring and possibly reformatting the data for the new system.

On the one hand, the migration requires resources that need to be allocated optimally, so that we reduce costs and migration time and avoid downtime as much as possible. In this scenario, the middleware can be a critical component of the migration. To facilitate and accelerate the process, we can split the source data into streams and parallelize the data transfer. As such, the middleware may need to make multiple read and write requests to the source and target systems and deploy a MapReduce-like architecture to unload, process, and reload the data. Depending on the size of the data, the constraints on time and budget, and the availability of resources, the middleware needs to make intelligent decisions by solving complex multi-dimensional problems.

On the other hand, we need to assume that other aspects of complexity may be present, including that the source, target, and middleware systems may not reside in the same infrastructure or may not belong to the same proprietor. In this case, the different migration phases (unloading, reformatting, compression, encryption, reloading) may have to be deployed on different parts of the infrastructure, depending on the availability of resources and in order to reduce transmission costs for larger data streams.

The final requirement is that the middleware be adaptive. The conditions of the migration process and infrastructure may vary during the actual migration. This variation may occur due to multi-tenancy and sharing of resources with other applications, due to reliability issues (e.g., migration nodes failing), or due to suboptimal original planning under uncertainty. The objective of this project is to build a decision-support system that optimizes the parameters of the migration and of the infrastructure, together with a self-adaptive migration middleware able to monitor the migration and adapt its configuration at runtime.
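To make the unload-process-reload pipeline concrete, the following is a minimal sketch of chunked, parallel migration through a middleware step. All names (`unload`, `transform`, `load`, `migrate`) and the in-memory source and target are illustrative assumptions standing in for real systems; a production middleware would additionally choose chunk sizes and worker counts dynamically, which is precisely the decision problem this project targets.

```python
import json
import zlib
from concurrent.futures import ThreadPoolExecutor

def unload(source, start, size):
    """Read one chunk of records from the (simulated) source system."""
    return source[start:start + size]

def transform(chunk):
    """Middleware step: convert to the target format and compress."""
    payload = json.dumps(chunk).encode("utf-8")
    return zlib.compress(payload)

def load(target, chunk_id, blob):
    """Write the processed chunk to the (simulated) target system."""
    target[chunk_id] = blob

def migrate(source, target, chunk_size=2, workers=4):
    """Split the source into chunks and migrate them in parallel streams."""
    def migrate_chunk(args):
        chunk_id, start = args
        load(target, chunk_id, transform(unload(source, start, chunk_size)))

    starts = range(0, len(source), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(migrate_chunk, enumerate(starts)))

# Toy migration: six records, three parallel chunks.
source = [{"id": i, "value": i * i} for i in range(6)]
target = {}
migrate(source, target)

# Verify the round trip: decompress and reassemble the chunks in order.
restored = [r for cid in sorted(target)
            for r in json.loads(zlib.decompress(target[cid]))]
assert restored == source
```

The thread pool here stands in for the parallel data streams described above; each worker performs its own read and write requests, so the same structure extends naturally to distributed workers in a MapReduce-like deployment.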

b. Tasks and responsibilities

The hired student will work towards the development of a prototype tool for a self-adaptive data migration mechanism. The student will develop the theoretical foundation as well as the implementation for such a mechanism. The student will aim to publish in top-tier journals, including IEEE Transactions on Cloud Computing, IEEE Transactions on Big Data, IEEE Transactions on Knowledge and Data Engineering, and ACM Transactions on Autonomous and Adaptive Systems, and in conferences such as SEAMS, ACSOS, ICSE, ICPC and others. The student will also be responsible for supervising and mentoring MSc and BSc students working on the project. The position is open for Winter, Summer or Fall 2024.

c. Required Skills

The student will be asked to demonstrate adequate understanding of or expertise in the following topics, through relevant courses (at the undergraduate or graduate level) or through relevant publications in international conferences or journals. The student should consider applying if they have the expert-level skills and at least 50% of the good-level skills.

d. Application process

Upon contacting the professor to inquire about the position, the student is also asked to submit the following documents: