I’ve seen a few posts recently about the emergence of a new field that is kind of like DevOps, but not quite, because it involves too much data. I started using the word DataDevOps about two years ago in conversation, and about a year ago in blog form, because that’s what I did: I develop and operate Data Science platforms, products and services. But more recently I have read of the emergence of DataOps.
Developing Jenkinsfile pipelines is hard. I think my world record for the number of attempts it took to get a working Jenkinsfile is around 20. When you have to continually push your changes and run the pipeline on a managed Jenkins instance, the feedback cycle is long. And the primary bottleneck to developer productivity is the length of the feedback cycle.
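To make the pain concrete, here is a minimal declarative Jenkinsfile (the stage name and shell command are illustrative placeholders, not from a real project). Even a pipeline this small typically takes several push-and-wait round trips to debug when your only test environment is the managed instance:

```
// Jenkinsfile: a minimal declarative pipeline.
// The stage and command below are placeholders for illustration.
pipeline {
    agent any
    stages {
        stage('Test') {
            steps {
                sh 'make test'
            }
        }
    }
}
```

A typo in any of these blocks is usually only reported after the pipeline has been pushed and scheduled, which is exactly why the feedback cycle dominates.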
Nearly everyone using Python for Data Science has used or is using the Pandas Data Analysis/Preprocessing library. It is as much of a mainstay as Scikit-Learn. Despite this, one continuing bugbear is the different core data types used by each: pandas.DataFrame and numpy.ndarray. Wouldn’t it be great if we didn’t have to worry about converting DataFrames to numpy types and back again? Yes, it would. Step forward Scikit-Pandas.
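To illustrate the round trip that causes the friction, here is a minimal sketch using only pandas and numpy (the DataFrame and its column names are made up for illustration):

```python
import numpy as np
import pandas as pd

# A toy DataFrame; the columns are hypothetical.
df = pd.DataFrame({"age": [25, 32, 47], "height": [1.68, 1.75, 1.82]})

# Scikit-Learn transformers consume and return plain ndarrays,
# so the column labels are lost on the way in...
X = df.to_numpy()
assert isinstance(X, np.ndarray)

# ...and have to be manually reattached on the way out.
df2 = pd.DataFrame(X, columns=df.columns)
assert list(df2.columns) == ["age", "height"]
```

Scikit-Pandas aims to remove this boilerplate by mapping DataFrame columns directly into Scikit-Learn transformers.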
Go introduced vendoring in version 1.5 of the language. The vendor folder is used as a dependency cache for a project. Because of the unique way Go handles dependencies, the cache contains the full code of an entire repository, warts and all. Go will search the vendor folder for its dependencies before it searches the global GOPATH. Tools have emerged to corral the vendor folder and one of my favourites is glide.
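For reference, glide drives the vendor folder from a glide.yaml manifest in the project root; it resolves the listed packages and copies their source into vendor/. The package paths and version below are illustrative placeholders:

```
# glide.yaml: the project's dependency manifest.
# Paths and versions are placeholders for illustration.
package: github.com/example/myservice
import:
- package: github.com/gorilla/mux
  version: ^1.6.2
```

Because Go checks vendor/ before GOPATH, the versions pinned here are the ones the build actually uses.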
The testing of microservices is inherently more difficult than testing monoliths due to the distributed nature of the code under test. But distributed applications are worth pursuing because by definition they are decoupled and scalable.
With planning, the result is a pipeline that automatically ensures quality. This automated quality assurance becomes increasingly important in larger projects, because no single person is willing or able to ensure the quality of the application as a whole.
This article provides some guidelines that I have developed whilst working on a range of software with microservice architectures. I attempt to align the concepts with best practice (references at the end of this article), but some of the terminology is my own.
I recently undertook a time-boxed, four-hour spike to investigate another Go microservices framework. Go-Micro is an “RPC framework for microservices”. It aims to provide common components that are often used in microservice deployments. It advertises itself as providing a pluggable architecture and boasts a long list of compatibilities.
When I think about my business, I often end up at one fundamental business truth. The goal of any business is to make profit. This is achieved by either making more money or reducing costs. Often, business activities are outsourced to experts in order to achieve one of these. For example, I outsource my accounting to an expert a) to ensure that I’m being tax efficient (saving costs) and b) because I lack the expertise and the time or desire to obtain it (saving costs).
Most traditional outsourcing routes affect either making more money (e.g. marketing) or reducing costs (e.g. accountancy, recruiting). Outsourcing research and development is different, in that it can benefit your business in both of these ways.