2pk03 over AI, ML, BigData and data processing


FreeIPA and Hadoop Distributions (HDP / CDH)

By Alexander Alten - October 24, 2016

FreeIPA is the tool of choice today when it comes to implementing a security architecture from scratch. I don't need to praise the advantages of FreeIPA; it speaks for itself. It's the Swiss Army knife of user authentication, authorization and compliance. Integrating FreeIPA into Hadoop distributions like Hortonworks' HDP and Cloudera's CDH requires some tweaks, but the outcome is worth it. I assume that the FreeIPA server setup is done and the client tools are distributed. If not, the guide from Hortonworks includes those steps, too. For Hortonworks, nothing more than the link to the documentation is necessary: https://community.hortonworks.com/articles/59645/ambari-24-kerberos-with-freeipa.html Ambari 2.4.x includes FreeIPA support (Ambari-6432); it is experimental, but it works as promised. The setup and rollout are pretty simple and run smoothly through the wizard. For Cloudera it takes a bit more manual work, but in the end it also works perfectly and well integ

Shifting paradigms in the world of BigData

By Alexander Alten - October 12, 2016

In building the next generation of applications, companies and stakeholders need to adopt new paradigms. The need for this shift is predicated on the fundamental belief that building a new application at scale requires solutions tailored to that application’s unique challenges, business model and ROI. Some things change, and I’d like to point to some of those changes.

Event Driven vs. CRUD

Software development is traditionally driven by entity-relation modeling and CRUD operations on that data. The modern world isn’t about data at rest; it’s about being responsive to events in flight. This doesn’t mean that you don’t have data at rest, but that this data shouldn’t be organized in silos. The traditional CRUD model is neither expressive nor responsive, given the countless data sources available. Although all data is structured somehow, an RDBMS can't store and work with data whose schema isn't known in advance (schema on write). That makes the use of additional fre
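To make the contrast between CRUD and event-driven handling concrete, here is a minimal sketch in Python. It is not taken from the post: the class names `CrudStore` and `EventLog`, the `key`/`payload` event layout and the sensor example are all assumptions chosen for illustration. The CRUD store overwrites state on every write, while the event log only appends facts and derives the current state when it is read.

```python
from dataclasses import dataclass, field
from typing import Any


class CrudStore:
    """CRUD style (hypothetical): each write replaces the stored row."""

    def __init__(self) -> None:
        self.rows: dict[str, dict[str, Any]] = {}

    def upsert(self, key: str, values: dict[str, Any]) -> None:
        # The previous values for this key are overwritten and lost.
        self.rows[key] = values


@dataclass
class EventLog:
    """Event style (hypothetical): append facts, derive state on read."""

    events: list[dict[str, Any]] = field(default_factory=list)

    def append(self, event: dict[str, Any]) -> None:
        # Nothing is validated against a schema at write time.
        self.events.append(event)

    def current_state(self, key: str) -> dict[str, Any]:
        # Replay every event for the key, in order, to build its state.
        state: dict[str, Any] = {}
        for event in self.events:
            if event.get("key") == key:
                state.update(event.get("payload", {}))
        return state


crud = CrudStore()
crud.upsert("sensor-1", {"temp": 20})
crud.upsert("sensor-1", {"humidity": 40})  # the temperature reading is gone

log = EventLog()
log.append({"key": "sensor-1", "payload": {"temp": 20}})
log.append({"key": "sensor-1", "payload": {"humidity": 40}})
state = log.current_state("sensor-1")  # both readings are still available
```

The event log keeps the full history, so new fields can appear in later events without changing anything that was written earlier; the price is that state has to be computed on read.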