Cloud-Native Data Science Blog

Have you ever wondered what cloud-native actually is? Are you confused about how to use data science to improve your business? Check out these informative blogs to help you get started.

https://prometheus.io is an open source time series database that focuses on capturing measurements and exposing them via an API. I love Prometheus because it it so simple; it’s minimalism is its greatest feature. It achieves this by pulling metrics from instrumented applications, not pulling like many of its competitors. In other words Prometheus “scrapes” the metrics from the application.

This means that it works very well in a distributed, cloud-native environment. All of the services are unburdened by load on the monitoring system. This has knock on effects meaning that HA is supported through simple duplication and scaling is supported through segmentation.

What do you mean by monitoring? Why do you need it? What are the real needs and are you monitoring them? Ask yourself these questions. Can you answer them? If not, you’re probably doing monitoring wrong.

This post asks the basic question. What is monitoring? How does it compare to logging and tracing? Let’s find out.

If you ask anyone what they think AI is, they’re probably going to talk about sci-fi. Science fiction has been greatly influenced by the field of artificial intelligence, or A.I.

Probably the two most famous books about A.I. are I, Robot, released in 1950 by Isaac Asimov and 2001: A Space Odyssy, released in 1968 by Arthur C. Clarke.

I, Robot introduced the three laws of robotics. 1) A robot must not injure a human being, 2) a robot must obay the orders, except where the orders would conflict with the First Law and 3) a robot must protect its own existance as long as such protection does not conflict with the First or Second Laws.

2001: A Space Odyssey is a story about a psychopathic A.I. called HAL 9000 that intentionally tries to kill the humans on board a space station to save it’s own skin, in a sense.

But the history of AI stems back much further…

Data Science is an emerging field that is plagued by lurid, often inconsequential reports of success. The press has been all too happy to predict the future demise of the human race.

But sifting through chaff, we do see some genuinely interesting reports of work that affects both bottom-line profit and top-line revenue.

Cloud-Native, a collection of tools and best practices, disrupts the ideas behind traditional software development. I am a firm believer of the core concepts, which include visibility, repeatability, resiliency and robustness.

The idea begins in 2015 when the Linux Foundation formed the Cloud-Native Computing Foundation. The idea was to collect the tools and processes that are often employed to develop cloud-based software.

However, the result was a collection of best practices which extend well beyond the realms of the cloud. This post introduces the essential components: DevOps, continuous delivery, microservices and containers.

Last week I was working on a simple implementation updating a shopping cart for a site, the frontend was written in html/javascript. The brief - when the quantity of an item in the cart was modified the client could press an update cart button which would update the cart database, after which it was necessary to recalculate the total values of the order and refresh the page with the new totals.

The terms “Cloud” or “Cloud Services” have become so laden with buzz that they would be happy to compete with Apollo 11 or Toy Story. But the hype often hides the most important aspects that you need to know. Like how it works, or what you can do with it. This is the first of several introductory pieces that focus on the very basics of modern applications.

In one of my applications, for various reasons, we now have a batch like process and a HTTP based REST application running inside the same binary. Today I came up against an issue where HTTP latencies were around 10 seconds when the batch process was running.

After some debugging, the reason for this is that although the two are running in separate Go routines, the batch process is not allowing the scheduler to schedule the HTTP request until the batch process has finished.

Go introduced vendoring into version 1.5 of the language. The vendor folder is used as a dependency cache for a project. Because of the unique way Go handles dependencies, the cache is full code from an entire repository; worts and all. Go will search the vendor folder for its dependencies before it searches the global GOPATH. Tools have emerged to corral the vendor folder and one of my favourites is glide.

The testing of microservices is inherently more difficult than testing monoliths due to the distributed nature of the code under test. But distributed applications are worth pursuing because by definition they are decoupled and scalable.

With planning, the result is a pipeline that automatically ensures quality. The automated assurance of quality becomes increasingly important in larger projects, because no one person wants to or is able to ensure the quality of the application as a whole.

This article provides some guidelines that I have developed whilst working on a range of software with microservice architectures. I attempt to align the concepts with best practice (references at the end of this article), but some of the terminology is my own.