Data storage

The Web Analytics use case demonstrated how data can be collected from a website. This use case shows how that data can be stored. Note that you need access to the environments used here. This means that IP whitelisting must be disabled on those environments, the required server ports must be open, and the user account used for the connection must have sufficient rights.

Add the code below to the Global concept from the Web Analytics example (replacing the placeholder credentials with those of your FTP server) to store the data in a log file:

flow
=> filter['url','pageName','visitorId','sandboxID','currentTime','currentDate']
=> ftp[
	host = "FTP server address", 
	user = "username", 
	password = "password", 
	dir = "/path/to/destination",
	flush = "true"
]
The flow element describes data processing which does the following:

  • The filter flow element not only selects the fields of a data tuple but also orders them. Here it puts the fields in the column order required for the log file.
  • The data is sent to the specified FTP location and stored in output.log (the default file name).
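
To make the ordering role of the filter concrete, here is a minimal Python sketch (not DimML) of the same idea: select the listed fields from a data tuple, in the listed order, and join them into one log line. The function name `to_log_line`, the sample tuple, and the semicolon delimiter are assumptions made for illustration; the delimiter actually written to output.log is not specified here.

```python
# Illustrative only: mimics the selecting and ordering done by the filter element.
FIELDS = ['url', 'pageName', 'visitorId', 'sandboxID', 'currentTime', 'currentDate']

def to_log_line(data_tuple, fields=FIELDS, delimiter=';'):
    """Keep only `fields`, in that order, and join them into one line."""
    # Missing fields become empty strings so column positions stay fixed.
    return delimiter.join(str(data_tuple.get(name, '')) for name in fields)

# Hypothetical data tuple; the key order differs from the column order on purpose.
sample = {
    'currentDate': '2015-01-31',
    'visitorId': 'abc123',
    'url': 'http://example.com/',
    'referrer': 'dropped by the filter',
    'pageName': 'home',
    'currentTime': '12:00:00',
    'sandboxID': 'sb1',
}

print(to_log_line(sample))
```

Whatever order the fields arrive in, every line ends up with the same column layout, which is what makes the log file easy to load into other tools.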

A big advantage of the DimML language is that it facilitates parallel processing. At any point in a flow, the data can be distributed to multiple endpoints in parallel. The example below sends the output not only to an FTP server but also to a MongoDB database and to an email address. Because the distribution flows run in parallel, an error in one of them does not cause data loss or errors at any of the other endpoints.

flow
=> filter['url','pageName','visitorId','sandboxID','currentTime','currentDate'] (datadistribution)

//by adding (datadistribution) at the end of the flow, the data tuple at that stage is distributed to all flows with that name
//since we use several flows here, each flow is executed in parallel

flow (datadistribution)
=> ftp[
	host = "FTP server address", 
	user = "username", 
	password = "password", 
	dir = "/path/to/destination",
	flush = "true"
]

flow (datadistribution)
=> mongo[
	uri = 'mongodb://user:password@ds054118.mongolab.com:54118/data',
	db = 'data',
	collection = 'web',
	key = 'visitorId,currentDate,currentTime',
	set = `[
		url: url,
		views: views + 1
	]`@groovy,
	inc = `[
		counterViews: views,
	]`@groovy
]

flow (datadistribution)
=> code[
	to = `'you@yourdomain.com'`,
	subject = `'DimML Mail services'`,
	mime = `'text/html'`,
	text = `'<html><head><title>DimML services</title></head><body>You have a visitor on your website!</body></html>'`
]
=> mail[`EMAIL_OPTIONS`]

const EMAIL_OPTIONS = `{
	username: 'userXXX',
	password: 'passwordXXXX',
	from: 'mail@dimml.io',
	auth: true,
	'starttls.enable': true,
	host: 'email-smtp.domain.com',
	port: 123
}`
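
The error isolation described above can be illustrated outside DimML. The following is a minimal Python sketch, not the DimML runtime: it hands one data tuple to several independent sinks in parallel and records a failure per sink instead of letting it stop the others. The names `distribute`, `log_sink`, `broken_db_sink`, and `mail_sink` are hypothetical stand-ins for the ftp, mongo, and mail flows.

```python
from concurrent.futures import ThreadPoolExecutor

def distribute(data_tuple, sinks):
    """Send one tuple to every sink in parallel.

    Returns a dict mapping each sink name to None (success) or the exception raised.
    """
    def run(sink):
        try:
            sink(data_tuple)
            return None
        except Exception as exc:  # a failing sink must not affect the others
            return exc

    with ThreadPoolExecutor(max_workers=len(sinks)) as pool:
        futures = {name: pool.submit(run, sink) for name, sink in sinks.items()}
        return {name: future.result() for name, future in futures.items()}

stored = []  # stands in for the external systems that receive the data

def log_sink(t):
    stored.append(('log', t['url']))

def broken_db_sink(t):
    # Simulates an endpoint that is down.
    raise ConnectionError('database unreachable')

def mail_sink(t):
    stored.append(('mail', t['url']))

results = distribute(
    {'url': 'http://example.com/'},
    {'log': log_sink, 'db': broken_db_sink, 'mail': mail_sink},
)

for name, error in results.items():
    print(name, 'ok' if error is None else 'failed: %s' % error)
```

In this sketch the log and mail sinks still receive the tuple although the database sink fails, which mirrors the behaviour described for the parallel distribution flows.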

Additional assignment

  1. Extend your DimML application such that data is also sent to a log file
  2. Extend your DimML application such that data is also sent to a database