Anant Corporation Blog

Our research, knowledge, thoughts, and recommendations about building and leading businesses on the Internet.



[Image: SearchStax, Solr, and Drupal logos]

How to Set Up a Drupal Website Connected to a SearchStax Deployment

Basic Steps:

  1. Create a new Deployment in SearchStax
  2. Upload a Custom Configuration to Your Solr Server
  3. Install Drupal
  4. Install Search API / Search API Solr plugin
  5. Configure the Search API Solr Plugin
  6. Add sample content
  7. Optional – Manually Index Your Site

Step 1: Create a New Deployment in SearchStax

Assuming you have already created a SearchStax account and do not already have a deployment set up, click on the Deployments tab and then click on the Add Deployment button at the top.  Enter a Deployment name, and select the most appropriate Region, Plan, and Solr Version for your needs.  In this example we will be using Solr Version 6.4.2.

Once you create your Deployment, you will see it in the Deployments dashboard.

Clicking on the arrow button on the right of the deployment will give you pertinent information about your deployment’s servers.  The Solr Load Balancer URL will bring you to your Solr server dashboard.

Step 2: Upload a Custom Configuration to Your Solr Server

Download the Search API Solr plugin files: https://www.drupal.org/project/search_api_solr

Included in the Search API Solr download are several configurations in the solr-conf folder, with subfolders 4.x, 5.x, and 6.x for the respective Solr versions.  

SearchStax uses Apache ZooKeeper for maintaining configuration information. Upload the appropriate configuration files via ZooKeeper and create a new collection. If you already have the ZooKeeper command-line script (zkcli), the two commands you will need are as follows:


Upload Configuration:

Linux:

zkcli.sh -zkhost <zookeeper URL> -cmd upconfig -confdir <Solr configuration> -confname <configuration name>

Windows:

zkcli.bat -zkhost <zookeeper URL> -cmd upconfig -confdir <Solr configuration> -confname <configuration name>


Create New Collection:

Linux:

curl '<Load Balancer URL>admin/collections?action=CREATE&name=<collectionName>&collection.configName=<configName>&numShards=1'

Windows:

curl.exe "<Load Balancer URL>admin/collections?action=CREATE&name=<collectionName>&collection.configName=<configName>&numShards=1" -k
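As a concrete illustration, here is roughly what the upload and create commands might look like on Linux once real values are filled in. The host names, configuration name, and collection name below are made up for this example; use the ZooKeeper and Load Balancer URLs from your own deployment's details page, and point -confdir at the solr-conf/6.x folder from the Search API Solr download:

zkcli.sh -zkhost ss123456-1-us-east-1-aws.searchstax.com:2181 -cmd upconfig -confdir search_api_solr/solr-conf/6.x -confname drupal_config

curl 'https://ss123456-us-east-1-aws.searchstax.com/solr/admin/collections?action=CREATE&name=drupal&collection.configName=drupal_config&numShards=1'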


Detailed instructions on uploading a new configuration and creating a new collection for a Solr deployment can be found here: https://www.measuredsearch.com/docs/


Once complete, go into your Solr dashboard. There will be a newly created core based on the collection name you defined in the create command. Make note of this core name.
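If you prefer to check from the command line, the new core should also show up in Solr's core status API (again substituting your own Load Balancer URL):

curl '<Load Balancer URL>admin/cores?action=STATUS&wt=json'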

Step 3: Install Drupal

If you do not already have one, there are many ways to create your own Drupal instance.  Some web hosting services offer specialized integrations and setup options that help streamline the process.  Here we used GoDaddy’s Cloud Server services, which in a few clicks can create a hosted Drupal website.
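If you would rather work from the command line than a hosting wizard, one minimal sketch (assuming PHP, Composer, and a database are already available) is to let Composer pull Drupal down and then finish the installer in the browser:

composer create-project drupal/drupal my_drupal_site

Point your web server's document root at the new directory and load the site in a browser to run the interactive installer.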

Step 4: Install Search API / Search API Solr plugin

Go to your Drupal website and log in.


Open up a new tab in your web browser and go to https://www.drupal.org/project/search_api. Once there, scroll down to find the download links. Right-click the download link for the compressed file and copy the link address.

In your Drupal site, either navigate to the /admin/modules page or click the Extend tab at the top.  Then click Install New Module.

Paste the link address copied earlier into the “Install from a URL” text field, and then click the Install button.

Once complete, you should see a confirmation message saying that the installation was complete.

Once you see that confirmation, install the Solr plugin for Search API (https://www.drupal.org/project/search_api_solr) in the same way as above.


Before continuing, you may need to install the Search API Solr composer dependencies.  See https://www.drupal.org/documentation/install/composer-dependencies for instructions on how to do this.


Next, you will need to enable the installed modules.  Click “Enable newly added modules”, or click on the Extend tab.  Scroll down to the section called Search to see new module settings.

Enable the following items: Search API, Solr search, and Solr Search Defaults.

Then click Install at the bottom of the page. Once the installation finishes, you should see a confirmation message.

Once complete, it is recommended to uninstall the Solr Search Defaults module, as it may affect performance.  Uninstalling the module will not remove the provided configurations.
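If you manage the site from the command line, the same modules can be installed and enabled with Composer and Drush, assuming both are available on the server; this also takes care of the Composer dependencies mentioned above. The machine names below assume the standard submodule layout (search_api, search_api_solr, and the Solr Search Defaults submodule search_api_solr_defaults):

composer require drupal/search_api drupal/search_api_solr
drush en search_api search_api_solr search_api_solr_defaults -y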

Step 5: Configure the Search API Solr Plugin

Now that the modules have been enabled, click on the Configuration tab. Look for the "SEARCH AND METADATA" section and click on Search API to configure it.

Once there, click Add Server.

Give your server a name, make sure the Backend is set to Solr, and configure your Solr backend. Check that the HTTP protocol is set to https, that the Solr host and Solr port are correct, and that the Solr core field is set to the core created in Step 2.
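For example, with a SearchStax deployment the backend settings might end up looking something like this (the host and core values are placeholders; use the values from your own deployment and the core name you noted in Step 2):

HTTP protocol: https
Solr host: ss123456-us-east-1-aws.searchstax.com
Solr port: 443
Solr path: /solr
Solr core: drupal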

If your configuration settings are valid, you will see a message saying that the information was saved successfully.

Next you will need to define an Index.  In the Search API configuration screen, click on Add Index.

Give your index a name and select the Data sources you wish to index. For this example, select Comment and Content. At the bottom of the page, make sure you select the Server created earlier.

Step 6: Add Sample Content


Install the Devel plugin using the same method as described in Step 4: https://www.drupal.org/project/devel.  Then, enable Devel and Devel generate.

You will see new options in multiple menus.  Go to Manage > Configuration, and scroll down to the Development section.  Here you have options to Generate content via Devel.

Click on Generate Content.

Select a Content Type and enter the number of nodes to create. For example, selecting Article and entering 20 will produce 20 new articles filled with dummy data.
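If you use Drush, Devel Generate exposes the same functionality as a command. The exact command name varies by Drush and Devel version, so check drush help on your site, but it is typically along the lines of:

drush generate-content 20

which creates 20 nodes of dummy content, equivalent to the form above.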

Step 7: Optional – Manually Index Your Site

A cron job will periodically index your site automatically, but if you want to see results immediately, go to the Search API configuration screen and click on the index created earlier. At the bottom, click the Index now button.

After you begin, you will see a progress bar.  Once it reaches 100%, you will get a Success message.

After your site has been indexed, you can view and query the data in Solr.
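For a quick sanity check from the command line, you can also query the new core directly through Solr's standard select handler. Substitute your own Load Balancer URL and core name; the q=*:* query simply matches every document:

curl '<Load Balancer URL><core name>/select?q=*:*&wt=json&rows=5'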

Congratulations!  You have now customized your Drupal website to allow for content to be indexed in your SearchStax Solr deployment.


Reactive Manifesto, The Next VisiCalc, and Future of Business Technology

Thanks to some of our great partnerships, our firm has recently consulted at the University of Michigan, Cisco, Intuit, and Kroger, as well as at several government agencies, on business information and enterprise technology. Even though we don't directly create consumer technology or applications, every consumer technology ultimately has a backend enterprise technology that makes it work, and a consumer technology company backed by crappy enterprise technology is bad for business.

I've been sensing a shift in business information for a while. The volume of business information, the frequency at which it is created, and the number of sources it comes from are only increasing, arguably exponentially. This means that businesses, and subsequently end users, need to rely on real-time processing and analysis of this information. The businesses that embrace the "reactive manifesto" approach to building software and technology are going to succeed in a new world where data comes from millions of people through their mobile devices, from processes through applications and software, from information through global data sources and APIs, and from systems in the form of servers and devices all over the globe. The "swarm" of sources is mind-boggling.
[Image: The Swarm of Sources]

The first business response to all this business information is: let's bring it all together to analyze it and visualize it. That's horseshit. Even with the big data technologies out there today, it is wasteful to try to process all of it at the same time. That's like trying to understand how the universe works at every second. The better response is to understand what's happening and react to it in the moment, in the context where it is important.

This reactive methodology of building large infrastructure can help businesses react to new IoT initiatives, integrate with the numerous pieces of business software needed to run the modern enterprise, and partner with other modern enterprises. Whatever you see out there in apps, devices, sites, and APIs has to be managed in the back. The case for silicon brains is strongest when you just can't do it with carbon brains. Technology has to get better, faster, through iterative machine learning in order to keep up with the amount of data being created.

Commercial organizations are being thrown sledgehammers by vendors such as Oracle, Cloudera, MapR, Databricks, and others. Although these products are great, they are more like personal computers without the real "killer app": they aren't solving industry-specific, vertical problems. Consulting companies waste inordinate time-and-materials costs to get it "right." What people need is "Lego block" software, so that non-technical folks can self-serve their information needs without hiring a data analyst, data engineer, data architect, data scientist, data visualizer, and of course a project manager. (If you do need a team today, Anant provides an elastic team with all of those skills for the same investment per month as a part-time or full-time employee. Message me or my team.)

I believe the major breakthrough that will change the experience for business technology users is going to be system design tools that help them get what they want without knowing how to program. I don't know what it will look like, but we need a VisiCalc for the new age, and no, it's not Google Spreadsheets. It's something else altogether. It's something that will satisfy the yearning for a tool that helps people mash up and leverage the various gradients between structured and unstructured data in dynamic knowledge pages that always keep us up to date on what we care about. A machine that learns what we need to know and summarizes it for us, but also allows us to manipulate that knowledge even if it is being created in ten different systems.

Webinar 2: How to Connect Online Business Software 101 (B2B)

Software, software, software everywhere – and applications! Let's face it, you probably couldn't run your company without them. At Anant, we help clients connect their different pieces of software, apps, and hardware to make workflows more productive and efficient. Next week, as part of our four-part webinar series, CEO Rahul Singh will present on Enterprise Application Integration (EAI) to help you understand how these connections can be executed and why they are important to your business operations.


Even though the term "Enterprise Application Integration" itself seems daunting, it is something you will be able to do (or want to do) to unlock opportunities for your company and yourself. Previously, monolithic applications, such as complex enterprise resource planning (ERP) systems, attempted to create large frameworks of applications that would give you an all-in-one solution to your application integration needs. One of the main obstacles with systems like these is that they are largely inflexible. In a business environment where processes, among other things, are often prone to change, it is not sustainable to use applications which are inherently rigid.


With the advancement of Application Programming Interface (API) technology, there is now a plethora of application integration opportunities. Using APIs, companies have a much more feasible way to get applications to talk to each other, or to a central data warehouse (essentially, a repository for data from operational systems such as marketing, sales, etc.). Most common internet applications, such as Google Apps, Salesforce, WordPress, and Jira, have APIs which can be programmed to suit your needs. However, there are pitfalls you need to avoid when connecting your applications together, as well as important best practices to follow.


During the webinar Rahul will address benefits, pitfalls, and best practices while walking through the different options currently available for integrations.  Sign up is now open for the Friday, September 16, webinar! We’ll start at 10am and finish about an hour later. This webinar will be a 20-25 minute presentation and demonstration followed by an open-forum discussion around the topic of connecting online business software. We hope to see you there!



Come see us present in person at one of our upcoming events. If you’re in the DC area next week, Rahul will be moderating the monthly Data Wranglers DC event on Wednesday, September 14, where the presenter will speak about using Spark and Accumulo. You can sign up here.