Log Analysis Scripts
Log analysis is a critical DevOps skill. Learn to parse, filter, and analyze log files to extract meaningful information.
Basic Log Parsing
Extract key information from logs:
#!/bin/bash
LOG_FILE="${1:-/var/log/syslog}"
# Count total lines
echo "Total entries: $(wc -l < "$LOG_FILE")"
# Count lines containing errors (grep -c counts lines, not matches)
echo "Errors: $(grep -ciE 'error|fail' "$LOG_FILE")"
# Count lines containing warnings
echo "Warnings: $(grep -ci 'warn' "$LOG_FILE")"
Filtering Log Entries
#!/bin/bash
# By log level
grep "ERROR" app.log
grep -E "ERROR|WARN" app.log
# By time range
grep "2024-01-15 1[0-2]:" app.log # 10:00-12:59
# By pattern
grep -E "user=[a-z]+" app.log
# Exclude patterns
grep -v "DEBUG" app.log # Everything except DEBUG
Extracting Fields with awk
#!/bin/bash
# Apache log: IP - - [date] "request" status size
# Top 10 IPs by request count
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
# Requests returning 404
awk '$9 == 404 {print $7}' access.log
# Total bytes transferred
awk '{sum += $10} END {print "Total bytes:", sum}' access.log
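The same field-extraction pattern gives a quick status-code breakdown, a useful first check for error spikes:
# Distribution of HTTP status codes ($9 in this log format)
awk '{print $9}' access.log | sort | uniq -c | sort -rn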
Time-Based Analysis
#!/bin/bash
# Requests per hour
awk '{
    # $4 looks like [15/Jan/2024:10:30:45 - the hour is the second ":"-field
    split($4, time, ":")
    hour = time[2]
    hours[hour]++
}
END {
    for (h in hours) print h, hours[h]
}' access.log | sort -n
# Peak hours (simplified)
grep -o '[0-2][0-9]:' app.log | sort | uniq -c | sort -rn | head -5
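The same split-and-count idea works at other granularities; here is a sketch for requests per day, assuming the same Apache-style timestamp in $4:
#!/bin/bash
# Requests per day: take the date part of [15/Jan/2024:10:30:45
awk '{
    split($4, dt, ":")
    day = substr(dt[1], 2)  # drop the leading "["
    days[day]++
}
END {
    for (d in days) print d, days[d]
}' access.log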
Error Pattern Analysis
#!/bin/bash
LOG_FILE="$1"
echo "=== Error Summary ==="
# Group errors by type
grep -i error "$LOG_FILE" | \
    sed 's/.*ERROR[: ]*//' | \
    sort | uniq -c | sort -rn | head -10
echo ""
echo "=== Recent Errors ==="
grep -i error "$LOG_FILE" | tail -5
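Error messages that differ only in IDs or numbers get counted as separate types; normalizing digits before counting groups them together (a sketch using one extra sed substitution):
# Collapse numbers so "timeout for request 4711" and "... 4712"
# count as the same error type
grep -i error "$LOG_FILE" | \
    sed 's/.*ERROR[: ]*//' | \
    sed -E 's/[0-9]+/N/g' | \
    sort | uniq -c | sort -rn | head -10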
Real-Time Log Monitoring
#!/bin/bash
# Follow log file
tail -f /var/log/app.log
# Follow with filtering
tail -f /var/log/app.log | grep --line-buffered "ERROR"
# Follow with highlighting
tail -f /var/log/app.log | grep --color=always -E "ERROR|$"
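One caveat: tail -f follows the file descriptor, so the stream goes quiet after log rotation; tail -F re-opens the file by name and is the better choice for long-running monitors:
# Survives log rotation: re-opens the file when it is replaced
tail -F /var/log/app.log | grep --line-buffered "ERROR"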
Complete Log Analysis Script
#!/bin/bash
set -euo pipefail
LOG_FILE="${1:-}"
[ -f "$LOG_FILE" ] || { echo "Usage: $0 <logfile>"; exit 1; }
echo "================================"
echo "Log Analysis Report"
echo "File: $LOG_FILE"
echo "Generated: $(date)"
echo "================================"
echo ""
echo "=== Overview ==="
echo "Total lines: $(wc -l < "$LOG_FILE")"
echo "Date range: $(head -1 "$LOG_FILE" | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}') to $(tail -1 "$LOG_FILE" | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}')"
echo ""
echo "=== Log Levels ==="
for level in ERROR WARN INFO DEBUG; do
    # grep -c prints 0 when nothing matches but exits non-zero;
    # "|| true" keeps set -e from aborting on that
    count=$(grep -c "$level" "$LOG_FILE" || true)
    printf "%-10s %d\n" "$level:" "$count"
done
echo ""
echo "=== Top Errors ==="
# "|| true" stops pipefail from killing the script when no errors exist
grep ERROR "$LOG_FILE" | \
    sed 's/.*ERROR[: ]*//' | \
    sort | uniq -c | sort -rn | head -5 || true
echo ""
echo "=== Hourly Distribution ==="
grep -oE '[0-9]{2}:[0-9]{2}' "$LOG_FILE" | \
    cut -d: -f1 | sort | uniq -c | \
    awk '{printf "%s:00 - %s\n", $2, $1}' || true
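A typical invocation, saving the report for later comparison (the script name here is just an example):
./log_report.sh /var/log/app.log | tee "report-$(date +%F).txt"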
Performance Metrics
#!/bin/bash
# Response time analysis (assumes lines contain response_time=<ms>)
# Note: match() with an array argument and asort() require GNU awk (gawk)
gawk '/response_time=/ {
    match($0, /response_time=([0-9]+)/, arr)
    times[++count] = arr[1] + 0   # +0 forces numeric sorting in asort()
    sum += arr[1]
}
END {
    if (count == 0) { print "No response_time entries found"; exit }
    asort(times)
    p50 = int(count * 0.5);  if (p50 < 1) p50 = 1
    p99 = int(count * 0.99); if (p99 < 1) p99 = 1
    print "Count:", count
    print "Average:", sum/count, "ms"
    print "P50:", times[p50], "ms"
    print "P99:", times[p99], "ms"
}' access.log
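If gawk is not available, the same percentiles can be computed with sort and standard tools; a rough sketch (the temp-file path is arbitrary):
#!/bin/bash
# POSIX-friendly alternative: extract, sort numerically, pick by rank
grep -o 'response_time=[0-9]*' access.log | cut -d= -f2 | sort -n > /tmp/times.$$
COUNT=$(wc -l < /tmp/times.$$)
if [ "$COUNT" -gt 0 ]; then
    echo "P50: $(sed -n "$(( (COUNT * 50 + 99) / 100 ))p" /tmp/times.$$) ms"
    echo "P99: $(sed -n "$(( (COUNT * 99 + 99) / 100 ))p" /tmp/times.$$) ms"
fi
rm -f /tmp/times.$$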
Alerting on Patterns
#!/bin/bash
LOG_FILE="/var/log/app.log"
ALERT_THRESHOLD=10
CHECK_INTERVAL=60
while true; do
    # Approximation: if the file changed in the last minute, count all
    # ERROR lines in it; find prints nothing otherwise, so default to 0
    ERROR_COUNT=$(find "$LOG_FILE" -mmin -1 -exec grep -c ERROR {} \; 2>/dev/null)
    ERROR_COUNT=${ERROR_COUNT:-0}
    if [ "$ERROR_COUNT" -gt "$ALERT_THRESHOLD" ]; then
        echo "ALERT: $ERROR_COUNT errors in last minute!" >&2
        # Send notification, e.g.:
        # mail -s "Error Alert" admin@example.com < /dev/null
    fi
    sleep "$CHECK_INTERVAL"
done
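The find-based check above re-counts the whole file whenever it changed; a more accurate pattern remembers how far it has already read and only scans new lines. A minimal sketch, with a hypothetical state-file path:
#!/bin/bash
# Count only ERROR lines appended since the last run by remembering
# the byte offset already scanned (the state-file path is an assumption)
LOG_FILE="/var/log/app.log"
STATE_FILE="/tmp/app-log.offset"
OFFSET=$(cat "$STATE_FILE" 2>/dev/null || echo 0)
SIZE=$(wc -c < "$LOG_FILE")
# If the log was rotated or truncated, start over from the beginning
[ "$SIZE" -lt "$OFFSET" ] && OFFSET=0
NEW_ERRORS=$(tail -c +"$((OFFSET + 1))" "$LOG_FILE" | grep -c ERROR || true)
echo "$SIZE" > "$STATE_FILE"
echo "New ERROR lines since last check: $NEW_ERRORS"
Run it from cron every minute and alert when NEW_ERRORS crosses the threshold.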
Key Takeaways
- Use grep for filtering, awk for field extraction
- sort | uniq -c | sort -rn is powerful for counting
- tail -f for real-time monitoring
- Parse timestamps for time-based analysis
- Group and count patterns to find trends
- Create reusable analysis scripts
- Combine tools with pipes for complex analysis
- Consider log rotation in long-running monitors

