Debugging a local Apache Flume agent in IntelliJ with a netcat source and logger sink

1. Download apache-flume-1.6.0-bin.tar.gz, extract it, and start Flume normally with the bundled bin/flume-ng script:

me@MacBook:~/Downloads/apache-flume-1.6.0-bin$ bin/flume-ng agent \
  --conf conf --conf-file conf/example.conf --name a1 \
  -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true \
  -Dorg.apache.flume.log.rawdata=true

The agent is configured with a netcat source and a logger sink:

me@MacBook:~/Downloads/apache-flume-1.6.0-bin$ cat conf/example.conf
# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

2. Find the underlying java command with ps -ef, or in the startup log output (the line beginning with + exec):

+ exec /Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/bin/java -Xmx20m -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true -cp '/Users/me/Downloads/apache-flume-1.6.0-bin/conf:/Users/me/Downloads/apache-flume-1.6.0-bin/lib/*' -Djava.library.path= org.apache.flume.node.Application --conf-file conf/example.conf --name a1
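
If the exec line has already scrolled away, the same command can be recovered from the process table. A minimal sketch, assuming a POSIX-like shell; flume_cmdline is a hypothetical helper name, not part of Flume:

```shell
# Hypothetical helper: keep only ps-style lines that mention the Flume
# main class (the class name comes from the exec line above).
flume_cmdline() {
  # The second grep drops the grep process itself from live ps output.
  grep 'org.apache.flume.node.Application' | grep -v 'grep'
}

# Usage against the live process table:
#   ps -ef | flume_cmdline
# (some ps builds truncate long lines; ps -efww typically lifts the limit)
```

The filter reads from standard input, so it can also be run over a saved log or ps dump.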

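3. Rerun that java command with the debugging options -Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=5005,suspend=y added. With suspend=y the JVM waits for a debugger to attach before it starts the agent: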

me@MacBook:~/Downloads/apache-flume-1.6.0-bin$ /Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/bin/java \
  -Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=5005,suspend=y \
  -Xmx20m -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true \
  -Dorg.apache.flume.log.rawdata=true \
  -cp '/Users/me/Downloads/apache-flume-1.6.0-bin/conf:/Users/me/Downloads/apache-flume-1.6.0-bin/lib/*' \
  -Djava.library.path= org.apache.flume.node.Application \
  --conf-file conf/example.conf --name a1
Listening for transport dt_socket at address: 5005

Or, when using Kerberized Kafka:

/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/bin/java \
-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=5005,suspend=y \
-Xmx20m -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true \
-Dorg.apache.flume.log.rawdata=true \
-Dflume.log.dir=/tmp/flume -Dagent.id=localhost \
-Djava.security.auth.login.config=/Users/me/jaas.conf \
-Djava.security.krb5.conf=/Users/me/krb5.conf \
-cp '/Users/me/apache-flume-1.6.0-bin/conf:/Users/me/apache-flume-1.6.0-bin/lib/*' \
org.apache.flume.node.Application --conf-file conf/kafka-dev.conf --name a1
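
On JDK 5 and later, -agentlib:jdwp is the documented replacement for the -Xdebug -Xrunjdwp pair and takes the same sub-options. A small sketch for generating the flag, assuming a POSIX-like shell; jdwp_opts is a hypothetical helper name:

```shell
# Hypothetical helper: print the JDWP agent flag.
#   $1 = port (default 5005), $2 = suspend, y or n (default y)
# suspend=y makes the JVM wait for a debugger before running main().
jdwp_opts() {
  printf '%s\n' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=${2:-y},address=${1:-5005}"
}

# Usage: splice the flag into the java command from step 2, for example
#   java $(jdwp_opts 5005) -Xmx20m ... org.apache.flume.node.Application ...
jdwp_opts 5005
# prints: -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005
```

Passing n as the second argument lets the agent start immediately, which suits attaching to an agent that is already processing events.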

4. Create a Maven project, or use an existing one, with the flume-ng-core dependency so that IntelliJ can resolve the Flume classes:

<dependency>
  <groupId>org.apache.flume</groupId>
  <artifactId>flume-ng-core</artifactId>
  <version>1.6.0</version>
</dependency>

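5. In IntelliJ, create a Remote debug configuration (Run > Edit Configurations) with host localhost and port 5005, matching the address option from step 3, then click Debug to attach to the waiting process.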

6. Set a breakpoint in the process method of org.apache.flume.sink.LoggerSink.

7. Connect to the netcat source with telnet and send sample text:

me@MacBook:~$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
aaaaa
OK
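
The same test can be scripted with nc instead of an interactive telnet session. A minimal sketch, assuming nc is installed and the agent from example.conf is listening on localhost:44444; event_lines is a hypothetical helper name:

```shell
# Hypothetical helper: print each argument on its own line, which is how
# the netcat source delimits events.
event_lines() {
  printf '%s\n' "$@"
}

# Usage: send two events to the source configured in example.conf.
#   event_lines aaaaa bbbbb | nc localhost 44444
# With default settings the source acknowledges each line with OK,
# as in the telnet session above.
```

Keeping the payload in a function makes it easy to replay the same events each time the breakpoint needs to be hit again.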
