General integration approach 1.0 (Part 2)

a dotted path goes to a platform with a floating magnifying lens followed by a set of boxes and then a floating screen with some figures on it

Quick Recap

Part 1 of this post covered the following steps of integrating Open Data for Industries with Cloud Pak for Data services:

STEP 1: Cloud Pak for Data Analytics Project and Notebook set up

STEP 2: Data refinery and cleansing

STEP 3: Data ingestion into the IBM Open Data for Industries instance

This is Part 2 which will cover:

STEP 4: Searching and retrieving data from the Open Data for Industries instance

STEP 5: Data analysis and prediction using Cloud Pak for Data services


General integration approach 1.0 (Part 1)

a dotted path goes to a platform with a floating magnifying lens followed by a set of boxes and then a floating screen with some figures on it

IBM Open Data for Industries is an enterprise-grade platform based on the OSDU (Open Subsurface Data Universe) data foundation. It runs as an IBM Cloud Pak for Data cartridge to:

  1. provide scalability, security, and flexibility.
  2. integrate seamlessly with other Cloud Pak for Data services, including Watson Studio (WSL), Watson Machine Learning (WML), Watson Knowledge Catalog (WKC), Watson Discovery, and Data Refinery, among others, to give customers an end-to-end IT solution.

IBM Open Data for Industries launched its v1.0.0 release in November 2020, and released v1.1.0 in February 2021. With 1.x.x releases, Open Data for Industries provide…


A real service backup and restore solution

a grassy field with birds on the edge of a body of water with a pier jutting into the water. It is sunset or sunrise

It was the third meeting I had within a week: a customer who uses Cloud Pak for Data wanted a backup and restore solution to meet their business requirements. The meeting was to discuss their backup and restore architecture and proposal, and to answer their technical questions.

When I was the SRE architect on Cloud Pak for Data, meetings like the above happened often. …


How IBM Cloud Pak for Data keeps its quality and reliability

a busy multilayer highway exchange

IBM Cloud Pak for Data is a comprehensive Data and AI platform that can be deployed on any cloud or on premises.

SRE (Site Reliability Engineering) was originally developed at Google to maintain service reliability and reduce human-detectable disruption to a target level, the Service-Level Objective (SLO). It plays a critical role in the reliability and availability of cloud services.

Cloud Pak for Data, as an on-premises platform, does not need SRE to maintain its “site” reliability per se. …


Backup & restore versus export & import: comparisons and differences of these two solutions

a field of golden plants and a field of green plants separated by a green lane with a tree visible with a blue sky and clouds

A customer using IBM Cloud Pak for Data wanted to upgrade from v2.5.0 to v3.0.1. After careful planning, they backed up the volumes following the Cloud Pak for Data backup & restore documentation and then ran the v2.5.0 to v3.0.1 upgrade. Everything was going well, but then a rack of machines crashed due to a power interruption at the beginning of functional verification after the upgrade. Even though the nodes restarted quickly, some storage had crashed and some data was consequently lost. They felt lucky that they had backed up ahead of…


Cloud native, CI/CD pipelines, and Agile development are hot topics in modern software development. The term development teams talk about most is MVP (minimum viable product). An MVP can be defined for a newly launched product or for a new feature delivered in a particular release. In either case, the goal of the MVP is to define the minimum set of features so that the team can develop and release quickly and collect feedback from customers. This has become general practice in software release cycles across the IT world.

Jingdong Sun

Software Architect and Developer
