Data Management Blog
Data management, Cloud, Transformation and anything else...

AWS Solutions Architect Associate

Renewed this one. I first took this in 2018 with a renewal in 2020. It's been interesting to see how it has evolved over time. 
First, no immediate pass/fail (ugh). Used to be that they wouldn't score you right away but would at least tell you pass/fail immediately. No longer (some say for security review which might make sense). Tip - it shows up in your account before you get notified, so check that first. 
Second, more big data stuff in there. Kinesis Firehose and Data Streams. Since I'd taken the Big Data Specialty, this was not a worry for me, but seeing more of it on this exam was new. 
Third, a lot more on cross-account handling. Especially things like AWS Organizations and the tools that Organizations makes use of. This is clearly due to the proliferation of multi-account customers. 
Fourth, a question on AWS Lake Formation. Did not see this in any of the classes or prep materials. Perhaps a directional indicator for the future? Maybe. #aws #certification #solutionsarchitect #bigdata

LinkedIn Privacy Settings Sitemap

Here is a hierarchical list of LinkedIn privacy settings, for those of you who, like me, have been frustrated at navigating the labyrinthine list of settings. Originally, I wanted to show what the defaults are, but LinkedIn does not indicate them. (This is the mobile app version.)

- Account preferences
-    - Name, location and industry
- Site preferences
-    - Language
-    - Content language
-    - Autoplay videos
-    - Showing profile photos
-    -    - Which LinkedIn members’ profile photos would you like to see?
-    - Feed preferences
-    - People also viewed
-    -    - Display “People also viewed” box on your Profile page?
-    - People you unfollowed
-    - Open web links in applications
- Sign in & security
-    - Account access
-    -    - Email addresses
-    -    - Phone numbers
-    -    - Change password
-    -    - Where you are signed in
-    -    - Devices that remember your password
-    -    - Two step verification
-    -    - Application lock
- Visibility
-    - Visibility of profile and network
-    -    - Profile viewing options
-    -    -    - Select what others see when you’ve viewed their profile
-    -    - Story viewing options
-    -    -    - Select what creators see when you’ve viewed their story
-    -    - Edit public profile
-    -    -    - Custom url, content, visibility, badges
-    -    - Who can see or download email addresses
-    -    -    - Who can see [email] on your profile or in approved apps?
-    -    -    - Allow your connections to download your email [email] in their data export?
-    -    - Connections
-    -    -    - Allow your connections to see your connections list
-    -    - Who can see your last name
-    -    -    - Select how your last name appears to others
-    -    - Representing your organization and interests
-    -    -    - Show your name and/or profile information with other content shown on LinkedIn?
-    -    - Profile discovery and visibility off LinkedIn
-    -    -    - Choose whether approved apps and partner services can find and display information from your profile
-    -    - Profile discovery using email addresses
-    -    -    - Who can discover your profile or connect with you if they have your email address?
-    -    - Profile discovery using phone number
-    -    -    - Who can discover your profile or connect with you if they have your phone number?
-    -    - Blocking
-    - Visibility of LinkedIn activity
-    -    - Manage active status
-    -    -    - Who can see that you are currently active while you are using LinkedIn?
-    -    - Share profile updates with your network
-    -    -    - Should we notify your network when your profile is updated or upon work anniversaries?
-    -    - Notify connections when you’re in the news
-    -    -    - Should we notify your connections and followers when you’re mentioned in the news?
-    -    - Mentions or tags
-    -    -    - Allow others to mention or tag you in content posted on LinkedIn
-    -    - Followers
-    -    -    - Choose who can follow your updates
-    -    -    - Make follow primary
- Communication
-    - How you get your notifications
-    -    - On LinkedIn
-    -    - Email
-    -    - Push
-    - Who can reach you
-    -    - Invitations to connect
-    -    -    - Choose who can connect with you
-    -    - Invitations from your network
-    -    -    - Allow your network to send you page invitations to follow companies and organizations?
-    -    -    - Allow your network to send you event invitations?
-    -    -    - Allow your network to send you invitations to subscribe to newsletters?
-    -    - Messages
-    -    -    - Let us know what type of messages you’d prefer to receive
-    -    - Research invites
-    -    -    - Allow LinkedIn or third-party partners to send you invitations for participating in product feedback surveys, market research, and other studies?
-    - Messaging experience
-    -    - Read receipts and typing indicators
-    -    -    - Choose if you want read receipts and typing indicators enabled
-    -    - Messaging suggestions
-    -    -    - Allow LinkedIn to show you Messaging suggestions, some of which are personalized using automated systems to recognize patterns in messages?
- Data privacy
-    - How LinkedIn uses your data
-    -    - Manage your data and activity
-    -    - Get a copy of your data
-    -    - Salary data on LinkedIn
-    -    - Search history
-    -    - Personal demographic information
-    -    - Social, economic, and workplace research
-    -    -    - Can we enable trusted third-party partners to use data about you for social, economic, and workplace research?
-    - Job seeking preferences
-    -    - Job application settings
-    -    - Share your profile when you click Apply for a job
-    -    - Commute preferences
-    -    - Signal your interest to recruiters at companies you’ve created job alerts for
-    -    - Stored job applicant accounts
-    - Other applications
-    -    - Permitted services
-    -    -    - These are the services to which you have granted access to your LinkedIn profile and network data.
-    -    - Microsoft Word
-    -    -    - Should we allow Microsoft Word to display your work experience descriptions from your profile in Resume Assistant?
- Advertising data
-    - Advertising preferences
-    -    - Profile data for personalizing ads
-    -    -    - Can we use your profile photo and profile information (like name or company) to personalize the content of ads, such as job ads?
-    -    - Interest categories
-    -    -    - Can LinkedIn use interest categories derived from your profile, actions you have taken on LinkedIn and Bing, and actions by similar members to show you relevant ads, such as job ads?
-    - Data collected on LinkedIn
-    -    - Connections
-    -    -    - Can we use information from your 1st-degree connections to show you more relevant ads, such as job ads?
-    -    - Location
-    -    -    - Can we use your location (postal code or city) to show you more relevant ads, such as job ads?
-    -    - Demographics
-    -    -    - Can we use your age to show you relevant ads?
-    -    -    - Can we use your gender to show you relevant ads?
-    -    - Companies you follow
-    -    -    - Can we use information from the companies you follow to show you more relevant ads, such as job ads?
-    -    - Groups
-    -    -    - Can we use the information from the groups you’ve joined to show you more relevant ads, such as job ads?
-    -    - Education
-    -    -    - What educational information can we use to show you more relevant ads, such as job ads?
-    -    - Job information
-    -    -    - What job related information can we use to show you more relevant ads, such as job ads?
-    -    - Employer
-    -    -    - What employment history can we use to show you more relevant ads, such as job ads?
-    - Third party data
-    -    - Audience insights for websites you visit
-    -    -    - Can we use information (that does not identify you) about your visits to other websites to help them better understand their audiences?
-    -    - Ads outside of LinkedIn
-    -    -    - Can we show you personalized ads outside of LinkedIn?
-    -    - Interactions with businesses
-    -    -    - Can we use your information that you’ve given to businesses to show you more relevant ads, such as job ads?
-    -    - Ad-related actions
-    -    -    - Can we use your information (that does not identify you) about actions you took on ads to let advertisers know aggregate information about how their ad performed?

APIs and "fair use"

The Supreme Court sided with Google in its dispute with Oracle over the repurposing of APIs under the aegis of "fair use". The justices overturned the lower court decision. This has huge implications for the technical world. As a frequent user of APIs, I found the case very interesting. It revolves around Google's attempt at a clean room design based on Java SE. Here are some takes: the ruling itself, the facts in non-legal language, the business perspective, the developer perspective, and the legal take.

AWS Certified Big Data Specialty

Lots of hard work to achieve this for sure but well worth it! 

When Push Comes to Shove (or playing cat and mouse with push technology)

When will the end of push happen?  I'm desperately waiting for the humane demise of what has become an inhumane technology.  One that has run amuck.

Push technology initially held so much promise.  A way for me to keep appointments, become aware of significant happenings and stay connected to people.  But the dream has become a nightmare.  Constant interruption and devices that absolutely demand my attention.  I feel like I'm in a game of cat and mouse trying to stay ahead of the coercive effects of push technology.  Am I not the customer?

Example 1, my iPhone.  There are icons on my home screen that cannot be removed.  Applications that I have no interest in and for which I did not ask.  It used to be Apple Watch, now it's things like the Wallet.  No matter what I do I can't remove them.  So I'm forced to stare at them every time I open the phone.  My daughter gave me a great solution to this problem.  She said to create a folder on the home screen named "garbage".  This folder can contain any app icons I don't want to see.  So I only have one unwanted icon on my home screen instead of several.  So I now stare at folder "garbage" every time I log in.

I've learned that newer versions of the OS allow you to remove the icons from the home screen but don't allow you to uninstall the apps.  A "half win".

As a business man, I believe this represents a failure to focus on the customer.  These apps on my iPhone are clearly so horrible that no person would voluntarily install this stuff, so the Apple people decided to make them in your face.  If they wanted real success, how about putting it on the App Store and getting good ratings like everything else.  Your app should be able to compete.

Example 2, Facebook notifications.  The use of the notifications feature itself went from useful to tragic.  At first, I got notified if someone mentioned me specifically.  Perfect use case, as I may want to respond.  And I did like this push feature.  But then it took a bad turn and started notifying me of every event under the sun.  It went even further and started notifying me of people posting things seemingly randomly.  Isn't that what the regular feed is already doing?  Why do I need to be notified that someone posted a picture that I am not even tagged in?  So push came to shove and I had to turn them off altogether.   Cat catches mouse.

Example 3, Samsung notifications.  My other phone is a Samsung.  I was trying to fix a Bluetooth connection problem and  started receiving a notification about my "Samsung account".  Well I have no interest in a Samsung account and never asked for it.  So I tried to find a way to turn off the notifications.  After fussing with it for a while, I received another notification that said very rudely "these notifications can't be turned off".  So now I have to live with a constant notification that never goes away?  What an annoyance!  In the end it was Samsung wanting all my personal information at all times.  So I couldn't log in and log out as desired.  I had to log in always or get spam notified.  Push became shove.  So that meant I had to delete my Samsung account altogether.  You lose.

This kind of behavior isn't like bloatware of old.  We've all dealt with bloatware, especially when buying computers.  They are famous for installing lots of junk.  The difference is that you could generally take the time and get rid of it.  But when this flotsam becomes mandatory, it really feels coercive.

Example 4, whenever any app is installed anywhere.  I bet many of you play this game too.  Whenever I install any app, I immediately go to the "settings" and uncheck all the notifications I don't want.  Because the default behavior is to get bombarded with notifications.  The developers don't ever think I have any other applications besides theirs.  Egocentric to say the least.  And some apps will not allow you to turn all of them off, merely reduce the number of them.  So push again becomes shove.  This inevitably leads to removal of the app, or disabling notifications for the app in the operating system itself.  Once again cat and mouse.

As a business person, I end up feeling like I am not the customer any longer.  I feel increasingly like I'm being viewed as merely a gold mine.  To be excavated for good stuff which is then sold to their real customers.  I could never run my business that way and I would love for technology firms to bring the customer back into focus.
Apps that don't annoy you into compliance but help you get something done.
Apps where the user is in control.
Apps that are designed for your benefit.
That way we don't have to play cat and mouse.

AWS CloudFormation: reluctantly embracing YAML

Yet Another Markup Language?  Really?
That was my view of YAML for a very long time.  As someone who has no problem with various formats for structured information like xml, json, etc, I couldn't see any real use case for making another one.  And then came AWS CloudFormation.
AWS CloudFormation uses templates for provisioning assets in the cloud.  I'd been working on a problem and it involved creating a Lambda function.  This in turn means creating and managing an IAM Role and Policy.  It also played into some S3 for storage (trying to go full serverless).  And so the CF template was quickly getting sizable.  And since there is no CF schema to work with (although there are attempts at such), there is no authoring "helper" that can be derived from the governing schema.  That kind of helper is why one can author easily in Xml no matter how large the file: the schema guides both you and the authoring tools.
Having used JSON for a long time (and even JSON Schema too), I was working with that syntax in CF.  Tool support for JSON is good and makes it easier to work with, especially features like code folding.  But when you scale JSON larger and larger, it starts to become unwieldy.  Don't get me wrong, I've dealt with JSON files many megabytes in size before.  So it isn't just a matter of size. With large JSON files that are consistent and repetitive, you can fold code/copy code and manage it not unlike Xml.
However with CF, I was getting frustrated at the process of brute force trial and error I was involved in.  (Ok, I know there are tons of template snips out there - and I was using them - but nothing did EXACTLY what I was trying to do.  So they were all "sorta" useful.)  I finally broke down and said, I'll try something new just to see if it has value.  And I dug into YAML for CF templates.  Structured, hierarchical, fold-able code is very familiar, so the literals of YAML were no problem.  But was it any better than JSON?
Enter Notepad++.  This tool brought the features needed to make YAML work just like JSON and Xml.  First and foremost code folding.  So I was more easily able to take code snippets and work them into my CF template.  Testing along the way.  And I began to like YAML.  Eminently more readable and more compact in terms of fitting code on a viewable page, it allowed me to visualize more of what I was authoring.  That has proven key to my progress.  And the explicit formatting of white space allows for easy viewing of indentation.  It was starting to look a lot like Python.
So I've come around to this new "Python-like" markup known as YAML.  It does have a use case that I can get behind.  Call me a reluctant embracer of YAML.
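To make the compactness point concrete, here is a minimal sketch in Java that renders one small, made-up template both ways and counts the lines. The logical ID DataBucket, the bucket name, and the TemplateShapes class are all hypothetical, and the two emitters are toys that only handle nested maps of plain strings; they are not a real YAML or JSON serializer.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class TemplateShapes {

    // Hypothetical one-resource template; the logical ID and bucket name are made up.
    static Map<String, Object> template() {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("BucketName", "example-data-bucket");
        Map<String, Object> bucket = new LinkedHashMap<>();
        bucket.put("Type", "AWS::S3::Bucket");
        bucket.put("Properties", props);
        Map<String, Object> resources = new LinkedHashMap<>();
        resources.put("DataBucket", bucket);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("Resources", resources);
        return root;
    }

    // Toy YAML-style emitter: nesting is expressed by indentation alone.
    static String toYaml(Map<String, Object> map, int depth) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Object> e : map.entrySet()) {
            out.append("  ".repeat(depth)).append(e.getKey()).append(":");
            if (e.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) e.getValue();
                out.append("\n").append(toYaml(child, depth + 1));
            } else {
                out.append(" ").append(e.getValue()).append("\n");
            }
        }
        return out.toString();
    }

    // Toy JSON emitter: the same nesting needs braces, quotes and commas.
    static String toJson(Map<String, Object> map, int depth) {
        StringBuilder out = new StringBuilder("{\n");
        Iterator<Map.Entry<String, Object>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Object> e = it.next();
            out.append("  ".repeat(depth + 1)).append('"').append(e.getKey()).append("\": ");
            if (e.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) e.getValue();
                out.append(toJson(child, depth + 1));
            } else {
                out.append('"').append(e.getValue()).append('"');
            }
            out.append(it.hasNext() ? ",\n" : "\n");
        }
        return out.append("  ".repeat(depth)).append("}").toString();
    }

    // Number of lines in a rendered template (a trailing newline is not counted).
    static int lines(String text) {
        return text.split("\n").length;
    }

    public static void main(String[] args) {
        String yaml = toYaml(template(), 0);
        String json = toJson(template(), 0);
        System.out.println(yaml);
        System.out.println(json);
        System.out.println("YAML lines: " + lines(yaml) + ", JSON lines: " + lines(json));
    }
}
```

For this one-resource template the YAML form is half the length of the JSON form (5 lines versus 10), mostly because the closing braces disappear. On a real template the ratio will differ, but the effect on how much fits on one screen is the same idea.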

Certified Solutions Architect

Passed the exam today.  Really pleased at reaching this goal.  I guess I've gone from knowing just enough to be dangerous to being actually dangerous!   The test was challenging but also about what I'd expected.   My road to certification started with a great course from Ryan Kroonenburg of ACloudGuru.  He not only knows his stuff, but he is an excellent communicator.  It can sometimes be hard to get both in one person.  I also had some key resources that helped.  First, beyond the course itself, was the official study guide from AWS.  It was originally published in October 2016, so some parts are dated.  But for the most part it was an excellent resource.  I also read that the folks who wrote the book also wrote the exam (haven't verified this myself).  Second, I bought some quizzes from Whizlabs.  They have 8 full length tests that are challenging and informative.  They help identify weak points to study.  In addition, I got some tests from IAASAcademy.  Those were also good, although I thought the Whizlabs were better.  And lastly (but certainly not least in importance) were the AWS documents, including the FAQs for each service.  Very useful resources for sure.  (Oh and of course the labs.)
So now that I hit my goal, it's time to do some damage.

Escapades in AWS Certification

Adventures in cloud computing.  Like many folks, I've been reading about this new game changer for some time.  And been experimenting with some of the tools.  S3 storage for static web hosting: check.  EC2 instances for managing compute tasks: check.  Even trying some auto scaling which is one key item that makes so much sense.  I think I'd learned just enough to know I was dangerous.
I had some ideas as to how I might like to make use of cloud technology. Migrate some of my tools to AWS and leverage Lambda functions in the design of how they would work.  Lambda is the game changer within the game changer.  I can see the amazing potential.  Wouldn't need to start out with it at first, but eventually get there.
After playing around in my own sandbox, I concluded there is no better way to learn than to get dirtier and get certified.  So I embarked on a certification exam for Solution Architect.
One can't research too long before finding that A Cloud Guru stands out as a leader in the teaching of this material.  I found a very inexpensive course and started my adventure.  More to come. 

Xml, JSON, and Darwinian competition

Recently gave a presentation on the relationship between JSON and Xml technologies.  I'd set it in the context of "friend or foe" as there are lots of people who frame the relationship between these two as some sort of competition in a zero sum game or a Darwinian death match.  On the one hand, Xml as the incumbent who is trying to fend off the nipping upstart, saying that JSON simply isn't a king killer.  On the other hand, the insurgent JSON is poised to topple the bloated, over-the-hill, yesterday technology.  Wresting the title from reluctant dinosaurs.

Having worked in both data integration as well as content management spaces, I've seen both natures and how they react to Xml and JSON.  I think the former are very hot on the JSON track, and rightly so.  With cloud applications, bandwidth is now an issue again.  And then there's mobile applications.  Lightweight, simple data structures for not overly complex data can make a huge beneficial difference.  So JSON will continue to have an increasing role there.

The content folks see value but are a little less keen on the JSON value proposition. An example of some skepticism is the concept of mixed content (elements intermingling with text).  This is a big, bright line that differentiates the two technologies.  Having tried several methods to work with this myself, I find that Xml's inherent support for mixed content is a really nice relief.  And content management will tend to run into mixed content more frequently than data integration specialists.  Still, content folks see some value in JSON for sure.  They don't like Xml's bloat any more than anyone else.
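As a small illustration of mixed content, the sketch below parses a made-up paragraph with the XML parser that ships with the JDK and lists the root's children in document order. The sample sentence and the MixedContentDemo class are my own assumptions for illustration. JSON has no native equivalent; you would have to invent a convention, such as an array that alternates strings and objects.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MixedContentDemo {

    // List the direct children of the root element, text and elements alike,
    // in the order they appear in the document.
    static List<String> children(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<String> out = new ArrayList<>();
            NodeList kids = root.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                Node kid = kids.item(i);
                if (kid.getNodeType() == Node.TEXT_NODE) {
                    out.add("text[" + kid.getNodeValue() + "]");
                } else if (kid.getNodeType() == Node.ELEMENT_NODE) {
                    out.add("element " + kid.getNodeName() + "[" + kid.getTextContent() + "]");
                }
            }
            return out;
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers stay simple.
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: text and inline elements interleaved.
        System.out.println(children("<p>Press <b>Save</b> to keep your <i>draft</i>.</p>"));
    }
}
```

The paragraph comes back as five children in order: text, a b element, text, an i element, and a final text node holding the period. Nothing had to be designed to get that ordering; it is simply how an Xml tree works.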

Ultimately however, this isn't a death match.  The Darwin analogy doesn't mean there can be only one survivor.  But an array of creatures that each have their strengths and weaknesses.  Like programming languages or Galapagos island animals, there is room for many.  I like JSON and find it very useful and fast for development.  I've developed JSON applications and experimented with JSON Schema in fact.  (More to come on this in another post.)  And when I come into contact with complex content structures or mixed content of any kind, I'm glad Xml is still in the toolbox.

Bitcoin - is there any "there" there?

Reading quite a bit recently about the technology that is popularly known as Bitcoin.  The use of block chain computing power to solve mathematical problems in return for money, to put it bluntly.  Article after article spoke to how this can be a transformative technology.
Fair enough.  Time to investigate and see how it is supposed to make people money.  I thought of 2 angles to try out.  First, the easiest is to think of it technologically.  And use computing power to test out how things work and how useful it proves.  Seems the things to do are setting up a wallet (after all I need a wallet to store all my major bucks right?).  Then start "mining" the math to create my way to wealth.  I installed Bitcoin Core wallet for Windows.  Installed and seems to be about what I expected and read about.   Next I installed GUI Miner which is a client that does the computing. So if I'm mining for bitcoin gold, where do I land my first shovel?  In order to find a place to squat and stake a claim, it's best to follow someone who knows.

So enter Slush Pool.  Pools are ways of aggregating computing power with a shared reward. Slush's Pool claims to be the world's first mining pool.  It's at least a place to start.  Soon I'm set up with a wallet and I'm using GUI miner to mine coins in Slush's pool.  So I sit back and let my computer make me money, right?!  Seeing the early returns, it's clear that it will take a very long time to make any money this way.  Can I reduce my overhead to maximize my margins?  

Researching pools, one quickly gets into issues of governance.  The competition to attract miners leads to claims of transparency and low cost pool providers.  (An interesting view that money creates government instead of the other way around. :) )  As an advocate of a Vanguard investment philosophy, I favor the strategy of keeping overhead low, which beats the higher cost guys most of the time without even trying.  But the global nature of this setting soon becomes apparent.  I'm not only choosing a pool that may claim to have low overhead fees, but my mining efforts are competing against third world cost structures.  Calculators spring up to tell you how your costs affect your mining potential.

In fact, in mining, the calculations of benefit soon become a discussion of your electricity rates.  Since coins are minted using computing power, you need to factor electricity costs into your potential profit margins.  But since "1"s and "0"s do not recognize borders, my first world electricity costs quickly mean I'm competing against inherently cheaper places around the world.  This means I'm starting to sour on raw mining for profit.  The margins simply aren't there unless I can employ an armada of machines at third world electricity rates.
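The arithmetic behind that conclusion can be sketched in a few lines of Java. Every number here is a hypothetical placeholder (the rig's power draw, the daily revenue, and both electricity rates), so treat it as a way to see the shape of the calculation rather than as real mining economics.

```java
final class MiningMargin {

    // Electricity cost per day for a rig drawing `watts` around the clock.
    static double dailyPowerCost(double watts, double ratePerKwh) {
        return watts / 1000.0 * 24.0 * ratePerKwh;
    }

    // What is left of the day's mining revenue after paying for power.
    static double dailyProfit(double dailyRevenue, double watts, double ratePerKwh) {
        return dailyRevenue - dailyPowerCost(watts, ratePerKwh);
    }

    // Electricity rate at which revenue exactly covers the power bill.
    static double breakEvenRate(double dailyRevenue, double watts) {
        return dailyRevenue / (watts / 1000.0 * 24.0);
    }

    public static void main(String[] args) {
        double revenue = 2.00;   // hypothetical revenue per day
        double watts = 1000.0;   // hypothetical power draw of the rig
        System.out.println("profit at 0.12 per kWh: " + dailyProfit(revenue, watts, 0.12));
        System.out.println("profit at 0.04 per kWh: " + dailyProfit(revenue, watts, 0.04));
        System.out.println("break-even rate: " + breakEvenRate(revenue, watts));
    }
}
```

With these made-up inputs the same rig loses about 0.88 a day at 0.12 per kWh and clears about 1.04 a day at 0.04 per kWh; break-even sits at roughly 0.083 per kWh. Identical hardware, opposite outcomes, and the only variable is the power bill.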

So I've learned about the technology that makes it work and I've learned that the mechanics of mining mean one will never get rich that way and this task is better left to low overhead miners.  What about a more philosophical or entrepreneurial view?  (Meaning I want to own my own pool.)  Where is the opportunity to put this technology to something different or in line with goals.  Can it be leveraged to solve a bigger problem?  I'd like to see this applied to something useful like fighting malaria or some kind of important goal.  Pools to simply make money are obvious and already exist.  Another one won't stand out.  Creating a pool that attracts investors (miners) for some motivation other than simply making money might do the trick. Indeed some of the pools are motivated by philosophies that attract a certain motivated miner.  This remains my landing point in this story. 

I'm left intrigued with Bitcoin (and the underlying block chain technology) even if I've not found a path to riches nor used it to solve a bigger problem.  The fact that it is making some inroads to mainstream usage and acceptance means it isn't a fad.  The technology is interesting and I can understand the attraction.  So there is some "there" there.  I'm just not sure where this fits into my strategies as yet.  Perhaps you'll find me next announcing a new mining pool that will plow all profits into fighting malaria.

replace front disc brake pads 1997 Toyota Camry

I've been in car maintenance mode as you can probably tell.  This time, it's been a long-standing issue. Quite frankly these brakes have been squeaking ever since I got the car.  Very annoying and actually embarrassing when driving friends.  I was told when I got it that the brakes were not that old and because of the squeaking the mechanic put in the exact OEM pads for this car.
So are they just worn out?  Are the slider pins needing lubrication? Something else?  As it turns out I think the problem was the pads.  They were not worn down all the way.  But they are metallic pads.  I switched to ceramic and this made all the difference.  Here is how I did it.

1997 Toyota Camry radiator replacement

Here is another item in the "anything else" category.  Recently had some car trouble and made a short video of a repair I did.  I have a 1997 Toyota Camry and I was getting P0115 error codes from my OBD-II reader.  As I was about to replace the engine coolant temperature (ECT) sensor, the radiator turned out to be leaking.  So I ended up replacing the radiator.  This video shows how I did it.

URIResolver with XSLT2 using Saxon on Tomcat and JSTL

Ran into a rather maddening problem last week.  I was working on a front end to a tool and was planning on using JSP within a Tomcat environment.  I'd downloaded the latest Tomcat (8.0.9 to be exact).  It installed ok.  Well most of my app is xml based and I needed to use XSLT 2.  So grabbed Saxon 9 ( - later tried and same behaviour), and added to my lib directory and with an environment variable update, presto - I was able to perform transformations.  (Just needed a property set in the JSP)
System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl");
So far a happy story right?  The issue came up around relative and absolute paths.  The collection() function was throwing errors if I tried to use a relative path.  It was annoying but not the end of the world as I could supply the full path behind the scenes.  Maddeningly, the doc() function was throwing an error if I used an absolute path.  So I had 2 functions that were each doing their own thing at different places in the tool.  But one required full path and one relative.  No exceptions.  I could work around this but it seemed silly to have to do this.

I wasn't sure if the problem was my code, Java, Tomcat, or Saxon (can you guess which?).  I found that it wasn't anything to do with encoding, so I could rule that out.  I started doing research and found some interesting (though dated) discussions here, here, here.  The issue was apparently around the URIResolver.  Potential workarounds/solutions here, here, here, here.

I also did some document-uri(.) functions on loaded documents.  It reflected this problem, as it was returning a path that started with "jstl:/../" instead of  "file:///c:/" or even "http://localhost".  So the resolver was definitely the problem.

Just as I was about to contemplate writing a custom URIResolver, I did some more digging in my JSTL tagging.  And it hit me that I might have outsmarted myself.  Turns out that @xsltSystemId not only provides the path for the XSLT, but also serves as the basis for all relative URLs used in the XSLT. So things like imports, doc(), collection(), etc.  They all are based on that.  So my solution was a humbling, simple attribute on my JSTL:

<x:transform xml="${thexml}" xslt="${xslt}"


Here is more info on the errors I'd gotten:

When a known and correct relative path was used, the collection() function resulted in this error (snipped for brevity):
HTTP Status 500 - javax.servlet.ServletException: javax.servlet.jsp.JspException: net.sf.saxon.trans.XPathException: Cannot resolve relative URI: Invalid base URI: Expected scheme-specific part at index 5: jstl:: jstl:
Meanwhile when the full path is given, and the collection() function works correctly, later on in the process, the same full path in doc() function returns this error (also snipped):
HTTP Status 500 - java.lang.IllegalArgumentException: Expected scheme-specific part at index 5: jstl:

java.lang.IllegalArgumentException: Expected scheme-specific part at index 5: jstl:
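
Both messages carry the same complaint about "jstl:". As a minimal sketch using only java.net.URI (not Saxon or JSTL themselves), the snippet below shows a relative reference resolving cleanly against a usable base URI, and the complaint the JDK's URI parser raises when the base is just "jstl:". The localhost path and the BaseUriDemo class are hypothetical, and I have not traced exactly where in Saxon or the JSTL tag the parse happens.

```java
import java.net.URI;

final class BaseUriDemo {

    // Resolve a relative reference against a base URI, the same general idea
    // as resolving a doc() or collection() argument against a stylesheet's base.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    // Return the URI parser's complaint about a base, or null if it parses.
    static String parseError(String base) {
        try {
            URI.create(base);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stylesheet location used as the base.
        System.out.println(resolve("http://localhost/app/xsl/main.xsl", "data/a.xml"));
        // A scheme with nothing after the colon is not a usable base.
        System.out.println(parseError("jstl:"));
    }
}
```

The first call yields http://localhost/app/xsl/data/a.xml. The second reports "Expected scheme-specific part at index 5: jstl:", the same text that appears in the errors above, which fits the idea that the transformation had no real base URI until the system ID was supplied.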

The Definitive Guide to TMS Top 5 Lists

This goes into the "anything else" category. One of my other passions is rock and roll music. And many people who hang around the worlds of hard rock and heavy metal are aware of a VH1 classic TV show called "That Metal Show" (@ThatMetalShow  #TMS). Hosted by Eddie Trunk, Don Jamieson, and Jim Florentine, the show has become a focal point of discussion, awareness, and fun around this music genre.

The show has numerous segments, but the one that struck me is the "Top 5" lists. Host-selected topics are debated and a "final" list is determined. This blog post is to show you how diligently (or perhaps crazily) I've taken an interest in these lists. I've researched the shows and found that nowhere was there a definitive list of Top 5 lists. So I created one myself!

Here is the link to The Definitive Guide to TMS Top 5 Lists. You can use this blog post to add comments (or to help me track down the few episodes for which I cannot find the list). Enjoy!
© Copyright Paul Kiel.
