Cloud Architect To Data Engineering
For those of you who want to enter the Data Science field, this is the route you should take. Ft: BowTied_Raptor
Celt Note: A real treat today, anons: we have BowTied_Raptor here to talk about how a cloud architect could pivot to data engineering, a booming field that works very closely with data scientists. If you are interested in data engineering after this read, you can read another post I wrote for Raptor here. And one last reminder: Raptor and Whitebelt both have articles on resume help, which can be found here and here, if you want to apply to data engineering roles.
If you’ve been reading BowTiedCelt for quite some time, then you will have already picked up some serious skills when it comes to cloud architecture. You are definitely quite skilled when it comes to working with tech that deals with cloud computing, management, and maintenance. On the more technical side, you would’ve nabbed skills in these areas:
Linux: Linux is used quite frequently for cloud development. You probably understand how to read schemas and architectures, and you probably already know how to maintain several Linux servers.
Database skills: Since you will be working with database administration a lot, you probably already have a solid set of cloud database management skills and a lot of knowledge about SQL servers.
Virtualization: You are probably quite good when it comes to deploying and running software on virtual machines, and at creating computer networks with them.
Cloud Computing: You probably already know which cloud service providers are available, which specific provider to use based on the business you are dealing with, and how to set things up on it.
If you were wondering whether there is another role similar to cloud engineering, but focused more on the actual data side of things and a little bit less on the architecture, then the answer is yes. The role you are looking for is called a Data Engineer, and Data Engineers work primarily in the Data Science team.
Data Engineers
Data Science
Data Science is a field that uses knowledge from linear algebra, statistical modelling, and programming to create algorithms that extract knowledge and insights from noisy data, and then applies what was learned across several different domains. Typically, in Data Science, you work with what’s known as big data: this could be several thousand images stored as Red Green Blue (RGB) layers (Image Recognition), text data that was extracted from Facebook (Natural Language Processing), or a company’s stock information (Quants).
You can read more about the fundamentals of Data Science by clicking here.
Data Engineering
In the field of Data Science, there is a role called a Data Engineer, who focuses less on the analytical side of things and a lot more on the architecture/infrastructure side. Data Engineers design and build systems to collect and store data on a massive scale. They are primarily focused on making sure the data that has been collected is in a highly usable state by the time it reaches the analytics team.
Here are some common day-to-day responsibilities of Data Engineers:
Acquire datasets from outside of the business
Build, test, and maintain data pipelines and data architectures
Create new data validation tools to clean up the data and get it into a usable format
Create and maintain databases for optimized data pipelines
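To make the "data validation tools" item a bit more concrete, here is a minimal sketch in Python of what a single cleaning step might look like. Everything in it is an assumption made up for illustration: the field names (user_id, email, amount), the rules, and the function names clean_record and run_pipeline. A real pipeline would have its own schema and checks.

```python
from typing import Optional


def clean_record(raw: dict) -> Optional[dict]:
    """Return a cleaned copy of one raw record, or None if it is unusable.

    The fields and rules here are hypothetical examples, not a standard.
    """
    try:
        user_id = int(raw["user_id"])
        amount = float(raw["amount"])
    except (KeyError, TypeError, ValueError):
        # Missing field or a value that cannot be converted to a number.
        return None

    # Normalize the email so downstream joins and lookups are consistent.
    email = str(raw.get("email", "")).strip().lower()
    if "@" not in email or amount < 0:
        return None

    return {"user_id": user_id, "email": email, "amount": amount}


def run_pipeline(raw_records: list) -> tuple:
    """Split raw records into (clean, rejected) lists."""
    clean, rejected = [], []
    for raw in raw_records:
        record = clean_record(raw)
        if record is None:
            rejected.append(raw)  # keep rejects so they can be inspected later
        else:
            clean.append(record)
    return clean, rejected
```

Keeping the rejected records around, instead of silently dropping them, is a design choice in this sketch: it lets someone check later why the data coming in was bad.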
Data Engineers also need to understand how to develop dashboards, how to set up optimized data retrieval processes, and how to integrate cloud computing into those processes whenever needed. Smaller organizations typically have 2 to 3 Data Engineers inside the data science team, while larger organizations have a completely separate data engineering team. Since Data Engineers work on the actual architecture side of things, it is a very technical position that requires strong skills in areas like programming, mathematics, and computer science.
If you want to know more about Data Engineering as a profession, click here.
The Data Engineer Toolset
Here are some of the tools that Data Engineers use for scaling and storing data:
Scala: Scala is a general-purpose programming language (like Python) that runs on the Java Virtual Machine. It is primarily meant for scaling and increasing performance.
Apache Spark: An engine for distributed computing on clusters, designed specifically to process large sets of data.
Java: It’s a simple programming language. Although it is a pain in the butt to work with, it’s very scalable.
Linux: Data Engineers use this for the same reasons that Cloud Engineers do.
Amazon Web Services (AWS): A lot of Data Engineers use AWS constantly for cloud storage and compute.
Data Engineer and Cloud Architect Skill Overlap
Both Data Engineers and Cloud Architects are expected to be really good at using Linux. You will also notice that although the tools are not a perfect match, they are similar enough that you could easily translate your skills in setting up and maintaining databases from one role to the other.
The biggest challenge that Cloud Architects will face is that they will be forced to learn Java, Python, or R in order to actually build the processes behind the pipelines and data quality checks. More specifically, they will be tasked with coding processes that focus on data quality and, at the same time, are extremely scalable. In other words, you will be forced to write extremely scalable code in Java/Python/R, which could legitimately be a problem if you’ve never touched any of the 3 programming languages before.
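To give a rough feel for what "quality-focused but scalable" code can look like, here is a small sketch in Python. It uses generators so that rows are checked one at a time and the whole file never has to fit in memory. The column names (ticker, price), the sample values, and the function names are all hypothetical, and this is only one of many ways to approach the problem.

```python
import csv
from typing import Iterable, Iterator


def read_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed CSV row at a time instead of loading everything."""
    yield from csv.DictReader(lines)


def valid_rows(rows: Iterable[dict]) -> Iterator[dict]:
    """Pass through only rows with a ticker and a positive numeric price."""
    for row in rows:
        try:
            ticker = row["ticker"]
            price = float(row["price"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows that are missing fields or are not numeric
        if ticker and price > 0:
            yield {"ticker": ticker, "price": price}


def total_price(lines: Iterable[str]) -> float:
    """Stream the input through the quality check and sum the prices."""
    return sum(row["price"] for row in valid_rows(read_rows(lines)))


# Hypothetical sample data; in practice `lines` could be an open file object.
sample = ["ticker,price", "AAA,10.5", "BBB,not_a_number", "CCC,-3", "DDD,4.5"]
total = total_price(sample)  # 15.0: only AAA and DDD pass the check
```

Because each function only ever holds one row at a time, the same code works on a 5-line sample and on a file with millions of rows. At cluster scale you would likely reach for something like Apache Spark instead, but the idea of streaming data through small, testable quality checks carries over.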
Other than that, it should essentially be smooth sailing because of the huge amount of overlap in the skillset.
Celt Note: Hope you all enjoyed that as much as I did. Consider taking advantage of this discount, it is worth it frens. It pays to be early :)