
Lab02Part2-kafka-EyeTracking

  • This lab builds on the first part of Lab02 (Lab02Part1: https://github.com/scs-edpo/lab02Part1-kafka-producer-consumer), which is also a prerequisite for this lab
  • The procedure to run the code is similar to Lab02Part1. We recommend importing the project into IntelliJ and letting the IDE handle everything
  • Note that only the new procedures and concepts are described in this lab

Use Case

  • This lab simulates a system in which user clicks and eye-tracking data from two eye-trackers are streamed
  • The eye-tracking data captures the gazes of two developers doing pair programming
  • We use Kafka producers and consumers to simulate this system

Objectives

  • Experimenting with several producers and consumers using different configurations for topics and partitions
  • Getting hands-on experience with a custom serializer and a custom partitioner
  • Experimenting with Kafka rebalancing and how it affects the distribution of partitions among consumers
  • Experimenting with offsets and manual offset commits

Overview

This lab consists of two parts.

In the first part, we create two producers (ClickStream-Producer and EyeTrackers-Producer).

In the second part, we consume the messages of these producers using consumers with different configurations. All the consumers are available in the "consumer" module, within the package com.examples.

Running the Docker image

  1. Open a terminal in the directory: docker/.

  2. Start the Kafka and Zookeeper processes using Docker Compose:

    $ docker-compose up
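
The repo ships its own Compose file in the docker/ directory. For illustration only (the image names, versions, and environment variables below are assumptions, not the repo's actual file), a minimal single-broker setup matching the bootstrap.servers=localhost:9092 setting used by the producers and consumers could look like this:

      version: '3'
      services:
        zookeeper:
          image: confluentinc/cp-zookeeper:7.3.0
          environment:
            ZOOKEEPER_CLIENT_PORT: 2181
        kafka:
          image: confluentinc/cp-kafka:7.3.0
          depends_on:
            - zookeeper
          ports:
            - "9092:9092"
          environment:
            KAFKA_BROKER_ID: 1
            KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
            KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
            KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1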
    

Producers

ClickStream-Producer

  • Main Class: com.examples.ClicksProducer

    • Overview: This producer produces click events and sends them to the "click-events" topic.
    • Procedure (#P1):
      • Specify topic

            // Specify Topic
            String topic = "click-events";
      • Read Kafka properties file

           // Read Kafka properties file
           Properties properties;
           try (InputStream props = Resources.getResource("producer.properties").openStream()) {
               properties = new Properties();
               properties.load(props);
           }
        • The following is the content of the used properties file producer.properties
             acks=all
             retries=0
             bootstrap.servers=localhost:9092
             key.serializer=org.apache.kafka.common.serialization.StringSerializer
             value.serializer=com.utils.JavaSerializer
        • Notice that we use a custom (value) serializer (see com.utils.JavaSerializer) to serialize Java objects before sending them (a sketch of such a serializer is shown after this procedure)
          • The custom serializer is specified in producer.properties with: value.serializer=com.utils.JavaSerializer
      • Create Kafka producer with the loaded properties

         // Create Kafka producer
         KafkaProducer<String, Clicks> producer = new KafkaProducer<>(properties);
      • For the sake of the simulation, delete any existing topic with the same name (i.e., click-events) and create a new topic with 1 partition. Note that we use a single partition inside the "click-events" topic so that all click events are stored in that unique partition (a sketch of the deleteTopic/createTopic helpers is shown after this procedure)

           // delete existing topic with the same name
           deleteTopic(topic, properties);
        
           // create new topic with 1 partition
           createTopic(topic, 1, properties);
      • Define a counter which will be used as an eventID

           // define a counter which will be used as an eventID
           int counter = 0;
      • At random time intervals in the range [500 ms, 5000 ms]:

        • Generate a random click event using constructor Clicks(int eventID, long timestamp, int xPosition, int yPosition, String clickedElement) (see com.data.Clicks). Note that the counter is used as an eventID
              // generate a random click event using constructor Clicks(int eventID, long timestamp, int xPosition, int yPosition, String clickedElement)
              Clicks clickEvent = new Clicks(counter, System.nanoTime(), getRandomNumber(0, 1920), getRandomNumber(0, 1080), "EL"+getRandomNumber(1, 20));
        • Send the click event and print the event to the producer console
                // send the click event
                producer.send(new ProducerRecord<String, Clicks>(
                        topic, // topic
                        clickEvent  // value
                ));
        
                // print to console
                System.out.println("clickEvent sent: " + clickEvent.toString());
        • Increment the counter (i.e., the eventID) for future use
              // increment counter, i.e., eventID
              counter++;
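
As referenced above, below is a minimal sketch of what a JSON-based value serializer along the lines of com.utils.JavaSerializer could look like. This is an assumption for illustration (Jackson-based, which would also explain why the consumers below receive values as LinkedHashMaps); the actual implementation in the repo may differ.

        // Hypothetical sketch; the real com.utils.JavaSerializer may differ
        import com.fasterxml.jackson.databind.ObjectMapper;
        import org.apache.kafka.common.serialization.Serializer;

        public class JavaSerializer implements Serializer<Object> {

            private final ObjectMapper mapper = new ObjectMapper();

            @Override
            public byte[] serialize(String topic, Object data) {
                try {
                    // turn the Java object (e.g., Clicks or Gaze) into JSON bytes
                    return mapper.writeValueAsBytes(data);
                } catch (Exception e) {
                    throw new RuntimeException("Could not serialize value for topic " + topic, e);
                }
            }
        }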

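As also referenced above, the deleteTopic and createTopic helpers can be implemented with Kafka's AdminClient. The following is a sketch under that assumption (the class name TopicUtils is hypothetical); the repo's actual helpers may differ.

        // Hypothetical sketch of the topic-management helpers
        import java.util.Collections;
        import java.util.Properties;
        import org.apache.kafka.clients.admin.AdminClient;
        import org.apache.kafka.clients.admin.NewTopic;

        public class TopicUtils {

            static void deleteTopic(String topic, Properties properties) {
                try (AdminClient admin = AdminClient.create(properties)) {
                    admin.deleteTopics(Collections.singletonList(topic)).all().get();
                } catch (Exception e) {
                    // the topic may simply not exist yet
                    System.out.println("Topic not deleted: " + e.getMessage());
                }
            }

            static void createTopic(String topic, int numPartitions, Properties properties) throws Exception {
                try (AdminClient admin = AdminClient.create(properties)) {
                    // replication factor 1 is enough for the single-broker lab setup
                    admin.createTopics(Collections.singletonList(
                            new NewTopic(topic, numPartitions, (short) 1))).all().get();
                }
            }
        }
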
Instructions

EyeTrackers-Producer

  • Main Class: com.examples.EyeTrackersProducer
    • Overview: This producer produces gaze events and sends them to the "gaze-events" topic.
    • Procedure (#P2):
      • Specify topic

            // Specify Topic
            String topic = "gaze-events";
      • Read Kafka properties file (similar to Procedure P1)

        • The following is the content of the used properties file producer.properties
             acks=all
             retries=0
             bootstrap.servers=localhost:9092
             key.serializer=org.apache.kafka.common.serialization.StringSerializer
             value.serializer=com.utils.JavaSerializer
             partitioner.class=com.utils.CustomPartitioner
          • Remember that in our use case we have 2 eye-trackers, and we would like to store the data from each eye-tracker in a distinct partition. Therefore, we use a custom partitioner (see com.utils.CustomPartitioner) to ensure that the events coming from each eye-tracker are always stored in the same distinct partition (a sketch of such a partitioner is shown after this procedure)
            • Reason: with the default partitioner, Kafka guarantees that events with the same key go to the same partition, but not the other way around, i.e., it does not guarantee that events with different keys always go to different partitions. Since events are assigned to partitions as partitionID = hash(key) % num_partitions, with a low number of partitions (e.g., num_partitions=2) it is very likely that 2 events with different keys still end up in the same partition.
            • The custom partitioner is specified in resources/producer.properties with: partitioner.class=com.utils.CustomPartitioner
          • Similar to P1, we use a custom (value) serializer (see com.utils.JavaSerializer) to serialize Java Objects before sending them
      • Create Kafka producer with the loaded properties (similar to P1)

      • For the sake of the simulation, delete any existing topic with the same name (i.e., gaze-events) and create a new topic with 2 partitions (corresponding to the two eye-trackers)

           // delete existing topic with the same name
           deleteTopic(topic, properties);
        
           // create new topic with 2 partitions
           createTopic(topic, 2, properties);
      • Define a counter which will be used as an eventID

           // define a counter which will be used as an eventID
           int counter = 0;
      • Every 8 ms:

        • Select a deviceID corresponding to a random eye-tracker (among the two available eye-trackers)

              // select random device
              int deviceID = getRandomNumber(0, deviceIDs.length);
        • Generate a random gaze event using the constructor Gaze(int eventID, long timestamp, int xPosition, int yPosition, int pupilSize) (see com.data.Gaze). Note that the counter is used as an eventID

             // generate a random gaze event using constructor  Gaze(int eventID, long timestamp, int xPosition, int yPosition, int pupilSize)
             Gaze gazeEvent = new Gaze(counter,System.nanoTime(), getRandomNumber(0, 1920), getRandomNumber(0, 1080), getRandomNumber(3, 4));         
        • Send the gaze event and print the event to the producer console. Notice that we use the deviceID as a key in the send method. This deviceID will be mapped to the corresponding partition in the "gaze-events" topic.

                // send the gaze event
                producer.send(new ProducerRecord<String, Gaze>(
                        topic, // topic
                        String.valueOf(deviceID), // key
                        gazeEvent  // value
                ));   
                // print to console
                System.out.println("gazeEvent sent: "+gazeEvent.toString()+" from deviceID: "+deviceID);
                          
        • Increment the counter (i.e., the eventID) for future use (similar to P1)
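
As referenced above, the following is a minimal sketch of a custom partitioner for this use case: it maps the record key (the deviceID, "0" or "1") directly to the partition with the same number. This is an illustrative assumption; the actual com.utils.CustomPartitioner may differ.

        // Hypothetical sketch; the real com.utils.CustomPartitioner may differ
        import java.util.Map;
        import org.apache.kafka.clients.producer.Partitioner;
        import org.apache.kafka.common.Cluster;

        public class CustomPartitioner implements Partitioner {

            @Override
            public int partition(String topic, Object key, byte[] keyBytes,
                                 Object value, byte[] valueBytes, Cluster cluster) {
                // key "0" -> partition 0, key "1" -> partition 1
                return Integer.parseInt((String) key);
            }

            @Override
            public void configure(Map<String, ?> configs) { }

            @Override
            public void close() { }
        }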

Instructions

Consumers

ConsumerForAllEvents

  • Prerequisite for running ConsumerForAllEvents: the ClickStream-Producer and the EyeTrackers-Producer should be running

  • Main Class: com.examples.ConsumerForAllEvents

    • Overview: this consumer consumes the events coming from both ClickStream-Producer and EyeTrackers-Producer

    • Procedure (#P3):

      • Read Kafka properties file and create Kafka consumer with the given properties

         // Read Kafka properties file and create Kafka consumer with the given properties
         KafkaConsumer<String, Object> consumer;
         try (InputStream props = Resources.getResource("consumer.properties").openStream()) {
             Properties properties = new Properties();
             properties.load(props);
             consumer = new KafkaConsumer<>(properties);
         }
        • The following is the content of the used properties file consumer.properties
               bootstrap.servers=localhost:9092
               key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
               value.deserializer=com.utils.JavaDeserializer
               group.id=grp1
               auto.offset.reset=earliest
        • Notice that we use a custom (value) deserializer (see com.utils.JavaDeserializer) to deserialize Java Objects
      • Subscribe to two topics: "gaze-events" and "click-events". The events in the "gaze-events" topic come from two partitions, while the events in the "click-events" topic come from one partition only

        // Subscribe to relevant topics
        consumer.subscribe(Arrays.asList("gaze-events","click-events"));
      • Poll new events at a specific rate and process the consumer records
              // poll new data
              ConsumerRecords<String, Object> records = consumer.poll(Duration.ofMillis(8));
      
              // process consumer records depending on record.topic() and record.value()
              for (ConsumerRecord<String, Object> record : records) {
                  // switch/case
                  switch (record.topic()) {
                  // note: record.value() is a LinkedHashMap (see com.utils.JavaDeserializer); you can use the following syntax to access a specific attribute: ((LinkedHashMap) record.value()).get("ATTRIBUTENAME").toString(); the object can also be reconstructed as a Gaze object (see the sketch after this procedure)
                      case "gaze-events":
                          String value = record.value().toString();
                          System.out.println("Received gaze-events - key: " + record.key() +"- value: " + value + "- partition: "+record.partition());
                          break;
      
                      case "click-events":
                          System.out.println("Received click-events - value: " + record.value()+ "- partition: "+record.partition());
      
                          break;
      
                      default:
                          throw new IllegalStateException("Shouldn't be possible to get message on topic " + record.topic());
                  }
              }
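
      • As noted in the code comment above, the LinkedHashMap produced by the custom deserializer can also be converted back into a Gaze object. A minimal sketch, assuming Jackson is on the classpath and that Gaze is a plain bean with a no-args constructor (both assumptions for illustration):

              // import com.fasterxml.jackson.databind.ObjectMapper;
              // import com.data.Gaze;
              ObjectMapper mapper = new ObjectMapper();
              Gaze gaze = mapper.convertValue(record.value(), Gaze.class);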

Instructions

ConsumerForGazeEventsForSingleEyeTracker

  • Prerequisite for running ConsumerForGazeEventsForSingleEyeTracker: the EyeTrackers-Producer should be running

  • Main Class: com.examples.ConsumerForGazeEventsForSingleEyeTracker

    • Overview: this consumer consumes the events coming from a single eye-tracker
    • Procedure (#P4):
      • The procedure is similar to Procedure P3, with the following difference:
        • This consumer consumes the events coming from a single eye-tracker (deviceID: 0) (These events were stored in partition "0" within the "gaze-events" topic)
          • This is specified using the following code fragment
             // Read specific topic and partition
             TopicPartition topicPartition = new TopicPartition("gaze-events", 0);
             consumer.assign(Arrays.asList(topicPartition));

Instructions

ConsumerCustomOffset

  • Prerequisite for running ConsumerCustomOffset: the ClickStream-Producer should be running

  • Main Class: com.examples.ConsumerCustomOffset

    • Overview: this consumer consumes the events coming from ClickStream-Producer, starting from a specific user-defined offset
    • Procedure (#P5):
      • The procedure is similar to Procedure P4, with the following difference:
        • The consumer reads from the topic "click-events"
        • The consumer starts reading events from a specific user-defined offset (i.e., int offsetToReadFrom)
          • This is specified using the following code fragment
          // reading from a specific user defined offset
          int offsetToReadFrom = 5;
          consumer.seek(topicPartition, offsetToReadFrom);
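
        • Putting the pieces together: seek() only affects partitions that have been assigned to the consumer, so the partition must be assigned before seeking. A sketch (the poll duration is illustrative):

              // assign the single partition of "click-events", then seek
              TopicPartition topicPartition = new TopicPartition("click-events", 0);
              consumer.assign(Arrays.asList(topicPartition));

              // start reading from offset 5
              int offsetToReadFrom = 5;
              consumer.seek(topicPartition, offsetToReadFrom);

              // subsequent polls return records starting at that offset
              ConsumerRecords<String, Object> records = consumer.poll(Duration.ofMillis(100));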

Instructions

rebalancingExample.*

Instructions
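
The rebalancing examples illustrate how Kafka redistributes the partitions of a topic among the consumers of a group when consumers join or leave. As a generic illustration (not necessarily the repo's exact code), a ConsumerRebalanceListener can be passed to subscribe() to log reassignments:

        // Generic sketch; the repo's rebalancingExample classes may differ
        // imports assumed: java.util.Arrays, java.util.Collection,
        // org.apache.kafka.clients.consumer.ConsumerRebalanceListener,
        // org.apache.kafka.common.TopicPartition
        consumer.subscribe(Arrays.asList("gaze-events"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                System.out.println("Partitions revoked: " + partitions);
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                System.out.println("Partitions assigned: " + partitions);
            }
        });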

singleAcessToPartitionAndRebalancingExample.*

Instructions

customCommit.singleAcessToPartitionAndRebalancingExample.*

Instructions

customCommit.commitLargerOffset.*

Instructions
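
The customCommit examples deal with manual offset commits (see the objective "Experimenting with offsets and manual offset commits"). As a generic illustration (not necessarily the repo's exact code, and assuming enable.auto.commit=false plus a hypothetical process() helper), a consumer can commit a specific offset per partition; the committed offset conventionally points to the next record to be read:

        // Generic sketch; the repo's customCommit classes may differ
        // imports assumed: java.util.Collections,
        // org.apache.kafka.clients.consumer.OffsetAndMetadata,
        // org.apache.kafka.common.TopicPartition
        for (ConsumerRecord<String, Object> record : records) {
            process(record); // hypothetical processing step

            // commit the offset of the next record to be read (current offset + 1)
            consumer.commitSync(Collections.singletonMap(
                    new TopicPartition(record.topic(), record.partition()),
                    new OffsetAndMetadata(record.offset() + 1)));
        }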
